├── .gitattributes
├── .gitignore
├── 000 - change the world.jpg
├── 000a - Data Science Stack.md
├── 000c - Medicare Blog Post.md
├── 000d - SQL Regex.md
├── 000e - Pesticides.md
├── 000f - Picking a College Using Data.md
├── 001 - Personal Security.md
├── 002 - California Water.md
├── 003 - spot pricing 3.md
├── 005 - spot pricing 2.md
├── 005 - spot pricing 4.md
├── 008 - Car Analysis Three.md
├── 008 - Car Prices.xlsx
├── 008 - Car Quality Stats.xlsx
├── 008 - Car-Models.xlsx
├── 008 - Cars-For-Sale.xlsx
├── 010 - CameraAwesomePhoto (1).jpg
├── 010 - CameraAwesomePhoto (2).jpg
├── 010 - Ethics Inputs.md
├── 010 - equality vs equity.jpg
├── 010 - food lifecycle.jpg
├── 010 - poor deserve the best.png
├── 010 - spider man.jpg
├── 015 - AnnualSalary2010thru2013.csv
├── 015 - Tuition by Raw Numbers.xlsx
├── 015 - UW Salary Info.xlsx
├── 015 - UW Student Tuition Plan.pdf
├── 015 - University Salary Analysis.md
├── 016 - College from First Principles.md
├── 020 - Industries.pdf
├── 020 - What is Ethical.md
├── 020 - ally bingo.jpg
├── 025 - Why Ethics.md
├── 025 - ethics.jpg
├── 030 - Looking for Ethical Work.md
├── 040 - Ethics for Data Pros.md
├── 040 - ML companies.png
├── 040 - NIST.SP.800-179.pdf
├── 040 - data ethics slide.jpg
├── 040 - ead_v1.pdf
├── 040 - ethical tech.jpg
├── 040 - the scored society.pdf
├── 050 - Privacy.markdown
├── 050 - The Database of Ruin.md
├── 050 - online tracking.jpg
├── 060 - Precautionary Principle.md
├── 070 - Environmental Engineering.md
├── 070 - climate change viz.jpg
├── 070 - space ceded to cars.jpg
├── 080 - Music and Engineering.md
├── 100 - Google Wagging the Dog.md
├── 100 - map-reduce funny.png
├── 110 - Big_Data_Landscape.png
├── 110 - Data Stack.md
├── 110 - data stack hadoop cat.png
├── 110 - data stack.jpg
├── 115 - Cost of Complexity.jpg
├── 115 - Systems Thinking.md
├── 120 - Cloud Computing incl Spot.md
├── 120 - cloud computing dev 1.JPG
├── 120 - cloud computing dev 2.JPG
├── 130 - Interdisciplinary.md
├── 135 - Curiosity and Ego.md
├── 140 - Net Neutrality.md
├── 150 - MonteCarlo.R
├── 150 - Project Paradox.png
├── 150 - Project Planning.markdown
├── 150 - Scratch.R
├── 151 - Project Planning 2.md
├── 152 - Project Planning 3.md
├── 153 - Project Planning 4.md
├── 160 - Communication and Storytelling.md
├── 160 - linkbait effectiveness.png
├── 170 - Getting Started With Programmind.md
├── 180 - Incentives.md
├── 180 - incentives.jpg
├── 190 - Project Animal Names.jpg
├── 190 - Project Names 2.jpg
├── 190 - Project Names.md
├── 200 - data viz.jpg
├── 200 - data viz.md
├── 2013-12-08-pagerank scale.md
├── 2013-12-28 Productivity Analysis.xlsx
├── 2013-12-28-productivity.md
├── 2014-02-23-hal-varian.md
├── 2014-05-19-social-network.md
├── 2014-06-01-democratization-of-bi.md
├── 210 - System Replacements.md
├── 220 - Personal Automation.md
├── 220 - software architecture.png
├── 230 - Association Rules in SQL AdventureWorks 2012.sql
├── 230 - Basic ML Using SQL.markdown
├── 230 - CameraAwesomePhoto.jpg
├── 240 - SQL and Digraphs.markdown
├── 250 - Finding a Vacation Using Data.markdown
├── 260 - Startups and Y Combinator.markdown
├── 270 - Smell Test Dilbert.jpg
├── 270 - The Smell Test.md
├── 280 - Agile and Waterfall.md
├── 290 - Data Science Evolution.md
├── 320 - Feature Engineering.md
├── 330 - Cognition for Data Professionals.md
├── 330 - Cognition for Data Pros.png
├── 330 - know all the things.jpg
├── 350 - Example Math.xlsx
├── 350 - Matrix Prioritizaton.md
├── 360 - Life is an Optimization Problem.md
├── 370 - Industry Comparisons.md
├── 380 - Trust.md
├── 400 - Software as a Craft.markdown
├── 400 - software.jpg
├── 410 - Scientific Method.markdown
├── 420 - Inductive vs Deductive.markdown
├── 430 - Hiring.md
├── 431 - CV of Failures.pdf
├── 431 - Job Searches.md
├── 431 - job searches as developer.png
├── 431 - resume viz.png
├── 431- interviewing honesty.jpg
├── 431-decoding-job-descriptions.jpg
├── 432 - Bad Work Situations.md
├── 432 - fail.jpg
├── 440 - Database Development.markdown
├── 450 - Engineering Constraints.markdown
├── 460 - Reputation Systems and PageRank.markdown
├── 470 - Amazon.md
├── 480 - Minorities in Technology.md
├── 480 - computing women.jpg
├── 480 - lego_gender.jpg
├── 480 - racism and bigotry.jpg
├── 480 - recruiting WIT.jpg
├── 480 - what happens we're out.png
├── 480 - women_astronomer.jpg
├── 480- perfectcrime.png
├── 490 - Chart of Cosmic Exploration.jpg
├── 490 - Science and Research.md
├── 490 - scientific method.jpg
├── 490 - what would feynman.png
├── 500 - Intro to Caching and Core Algos.markdown
├── 501 - Moore's Law.md
├── 502 - Self-Documenting Code.md
├── 510 - Analysis of Brilliant People.markdown
├── 510 - Brilliant People.png
├── 510 - Smart People Traits.xlsx
├── 520 - Find a Health Using Data.md
├── 520 - Natural Food Remedies Notes.txt
├── 520 - healthy foods.jpg
├── 520.jpg
├── 530 - 10 commands of architecture.markdown
├── 540 - Learning and Retention Methods.markdown
├── 550 - SQL on RDS.markdown
├── 560 - Balance.markdown
├── 570 - Housing Using Data.md
├── 600 - Advanced ETL Approaches.markdown
├── 610 - ETL tips and Tricks.markdown
├── 620 - Data Science Intro.markdown
├── 621 - SQLSatRedmond - ML For Mere Mortals.markdown
├── 621 - photo.JPG
├── 640 - Making Data Friendly Organizations.markdown
├── 650 - Data To Decisions Education Abstract.html
├── 650 - Data to Decisions Ed abstract.md
├── 650 - Data to Decisions for Education.md
├── 700 - autotrader_scrape.py
├── 9900 - Cloud Uploads.jpg
├── 9900 - Graphical Models.PDF
├── 9900 - IT_roles.jpg
├── 9900 - Programming Links.md
├── 9900 - commit linkbait.jpg
├── 9900 - complexity kills.jpg
├── 9900 - complexity.jpg
├── 9900 - devops and security.jpg
├── 9900 - enterprise-it.png
├── 9900 - git undo flowchart.png
├── 9900 - ie-must-die.jpg
├── 9900 - javascript.png
├── 9900 - linux perf tools.jpg
├── 9900 - multithreading.jpg
├── 9900 - programmer_style.png
├── 9900 - programming spec.jpg
├── 9900 - reading software.png
├── 9900 - software-engineer.jpg
├── 9900 - stackoverflow.png
├── 9900 - wicked problems.jpg
├── 9901 - GDP vs GNH.jpg
├── 9901 - Productivity Links.md
├── 9901 - Smartphone Crossing.jpg
├── 9901 - learning stages.jpg
├── 9901 - profanity motivation.jpg
├── 9902 - Career and Branding.md
├── 9903 - CEO streamlining.jpg
├── 9903 - Finance Links.md
├── 9903 - Robots and labor.jpg
├── 9903 - counter-Varian Rule.jpg
├── 9903 - trickle down economics.jpg
├── 9904 - 2016-12-9-gans.pdf
├── 9904 - Big Data Deities.png
├── 9904 - Data Science and Engineering Links.md
├── 9904 - Overfitting diagram.jpg
├── 9904 - RoadToDataScientist1.png
├── 9904 - Scikit_Learn_Cheat_Sheet_Python.pdf
├── 9904 - data science funny.jpg
├── 9904 - data science over time.png
├── 9904 - data science skills venn.jpg
├── 9904 - data viz.png
├── 9904 - data-science-venn-diagram.jpg
├── 9904 - machine learning industry.png
├── 9904 - ml libraries.png
├── 9904 - never use piece charts.jpg
├── 9904 - stats-trick-question.jpg
├── 9904 - storytelling.jpg
├── 9904 - tools.jpg
├── 9904.jpg
├── 9905 - Grief.md
├── 9905 - Parenting Chores over Time.jpg
├── 9905 - Parenting Iron Triangle.jpg
├── 9905 - Personal Life Links.md
├── 9905 - money and time.jpg
├── 9905 - no hipsters.jpg
├── 9905 - why people become unhappy.jpeg
├── 9906 - Security Links.md
├── 9906 - time to crack password.png
├── 9907 - Health Links.md
├── 9907 - Salad Ideas.md
├── 9907 - cheese wheel.jpg
├── 9907 - dentist prices.pdf
├── 9907 - growth of hospital admins.png
├── 9907 - overweight.jpg
├── 9907 - recipe recommendation ML.pdf
├── 9907 - vaccines.gif
├── 9908 - Academia Misincentives.jpg
├── 9908 - College and Career.jpg
├── 9908 - Education Links.md
├── 9908 - NFL odds.jpg
├── 9908 - academic minions.jpg
├── 9908 - education retention.jpg
├── 9908 - game of loans.jpg
├── 9908 - goal of education.jpg
├── 9908 - incentives.jpg
├── 9908 - teacher feedback funny.jpg
├── 9908 - textbooks.png
├── 9908.jpg
├── 9909 - Education Reform Warnings.pdf
├── 9909 - Leadership and Management Links.md
├── 9909 - Skunk Works Leadership.png
├── 9909 - get out of the way.jpg
├── 9909 - org charts.jpg
├── 9909 - typical conversation with managers.webm
├── 9910 - Startups.md
├── 9910 - mvp.png
├── 9910 - sick burn by new yorker.jpg
├── 9911 - CBP Task Group Out-brief Slides_FINAL.pdf
├── 9911 - ComparisonOfVotingSystems.png
├── 9911 - Government.md
├── 9911 - Terrorism causes.png
├── 9911 - police and recording.jpg
├── 9912 - Intellectual Property.md
├── 9913 - Companies.md
├── 9913 - Net Neutrality.png
├── 9913 - coca cola.png
├── 9913 - misbehaving.jpg
├── 9914 - Privacy and Security.md
├── 9914 - privacy vs security.jpg
├── 9915 - dont shoot.png
├── 9916 - Police and the Justice System.md
├── 9916 - how to survive police encounters.jpg
├── 9917 - Military.md
├── Archive
├── 140 - Nonprofit_Revenue_-_Donation_Cannibalization.pdf
├── 140 - Seattle Art Museum.markdown
├── 170 - Seattle Aquarium.md
├── 2014-05-08-keynote-one.md
├── 2014-05-08-keynote-two.md
├── 2014-05-13-passbac survey.xlsx
├── 2014-06-11-passbac.md
├── 2014-07-01-tsql-tuesday.md
├── NodeXL graphs.md
└── uw - 010 - introduction.md
├── Genome Science Blog Post.md
├── List of things I still can't do in November 2014.md
├── Principles_of_Performance_Tuning.md
├── README.md
├── company size and culture.png
├── crime-vs-incarceration.jpg
├── darwin award.jpg
├── data bias.jpg
├── einstein_ethics.jpg
├── equal-vs-fair.png
├── math_for_grownups.jpg
├── mechanical_calculator.jpg
├── precision-and-recall.jpg
├── resistance is just.jpg
├── student_debt.jpg
└── wolf debt.png
/.gitattributes:
--------------------------------------------------------------------------------
1 | # Auto detect text files and perform LF normalization
2 | * text=auto
3 |
4 | # Custom for Visual Studio
5 | *.cs diff=csharp
6 | *.sln merge=union
7 | *.csproj merge=union
8 | *.vbproj merge=union
9 | *.fsproj merge=union
10 | *.dbproj merge=union
11 |
12 | # Standard to msysgit
13 | *.doc diff=astextplain
14 | *.DOC diff=astextplain
15 | *.docx diff=astextplain
16 | *.DOCX diff=astextplain
17 | *.dot diff=astextplain
18 | *.DOT diff=astextplain
19 | *.pdf diff=astextplain
20 | *.PDF diff=astextplain
21 | *.rtf diff=astextplain
22 | *.RTF diff=astextplain
23 |
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 | #################
2 | ## Eclipse
3 | #################
4 |
5 | *.pydevproject
6 | .project
7 | .metadata
8 | bin/
9 | tmp/
10 | *.tmp
11 | *.bak
12 | *.swp
13 | *~.nib
14 | local.properties
15 | .classpath
16 | .settings/
17 | .loadpath
18 |
19 | # External tool builders
20 | .externalToolBuilders/
21 |
22 | # Locally stored "Eclipse launch configurations"
23 | *.launch
24 |
25 | # CDT-specific
26 | .cproject
27 |
28 | # PDT-specific
29 | .buildpath
30 |
31 |
32 | #################
33 | ## Visual Studio
34 | #################
35 |
36 | ## Ignore Visual Studio temporary files, build results, and
37 | ## files generated by popular Visual Studio add-ons.
38 |
39 | # User-specific files
40 | *.suo
41 | *.user
42 | *.sln.docstates
43 |
44 | # Build results
45 |
46 | [Dd]ebug/
47 | [Rr]elease/
48 | x64/
49 | build/
50 | [Bb]in/
51 | [Oo]bj/
52 |
53 | # MSTest test Results
54 | [Tt]est[Rr]esult*/
55 | [Bb]uild[Ll]og.*
56 |
57 | *_i.c
58 | *_p.c
59 | *.ilk
60 | *.meta
61 | *.obj
62 | *.pch
63 | *.pdb
64 | *.pgc
65 | *.pgd
66 | *.rsp
67 | *.sbr
68 | *.tlb
69 | *.tli
70 | *.tlh
71 | *.tmp
72 | *.tmp_proj
73 | *.log
74 | *.vspscc
75 | *.vssscc
76 | .builds
77 | *.pidb
78 | *.log
79 | *.scc
80 |
81 | # Visual C++ cache files
82 | ipch/
83 | *.aps
84 | *.ncb
85 | *.opensdf
86 | *.sdf
87 | *.cachefile
88 |
89 | # Visual Studio profiler
90 | *.psess
91 | *.vsp
92 | *.vspx
93 |
94 | # Guidance Automation Toolkit
95 | *.gpState
96 |
97 | # ReSharper is a .NET coding add-in
98 | _ReSharper*/
99 | *.[Rr]e[Ss]harper
100 |
101 | # TeamCity is a build add-in
102 | _TeamCity*
103 |
104 | # DotCover is a Code Coverage Tool
105 | *.dotCover
106 |
107 | # NCrunch
108 | *.ncrunch*
109 | .*crunch*.local.xml
110 |
111 | # Installshield output folder
112 | [Ee]xpress/
113 |
114 | # DocProject is a documentation generator add-in
115 | DocProject/buildhelp/
116 | DocProject/Help/*.HxT
117 | DocProject/Help/*.HxC
118 | DocProject/Help/*.hhc
119 | DocProject/Help/*.hhk
120 | DocProject/Help/*.hhp
121 | DocProject/Help/Html2
122 | DocProject/Help/html
123 |
124 | # Click-Once directory
125 | publish/
126 |
127 | # Publish Web Output
128 | *.Publish.xml
129 | *.pubxml
130 | *.publishproj
131 |
132 | # NuGet Packages Directory
133 | ## TODO: If you have NuGet Package Restore enabled, uncomment the next line
134 | #packages/
135 |
136 | # Windows Azure Build Output
137 | csx
138 | *.build.csdef
139 |
140 | # Windows Store app package directory
141 | AppPackages/
142 |
143 | # Others
144 | sql/
145 | *.Cache
146 | ClientBin/
147 | [Ss]tyle[Cc]op.*
148 | ~$*
149 | *~
150 | *.dbmdl
151 | *.[Pp]ublish.xml
152 | *.pfx
153 | *.publishsettings
154 |
155 | # RIA/Silverlight projects
156 | Generated_Code/
157 |
158 | # Backup & report files from converting an old project file to a newer
159 | # Visual Studio version. Backup files are not needed, because we have git ;-)
160 | _UpgradeReport_Files/
161 | Backup*/
162 | UpgradeLog*.XML
163 | UpgradeLog*.htm
164 |
165 | # SQL Server files
166 | App_Data/*.mdf
167 | App_Data/*.ldf
168 |
169 | #############
170 | ## Windows detritus
171 | #############
172 |
173 | # Windows image file caches
174 | Thumbs.db
175 | ehthumbs.db
176 |
177 | # Folder config file
178 | Desktop.ini
179 |
180 | # Recycle Bin used on file shares
181 | $RECYCLE.BIN/
182 |
183 | # Mac crap
184 | .DS_Store
185 |
186 |
187 | #############
188 | ## Python
189 | #############
190 |
191 | *.py[cod]
192 |
193 | # Packages
194 | *.egg
195 | *.egg-info
196 | dist/
197 | build/
198 | eggs/
199 | parts/
200 | var/
201 | sdist/
202 | develop-eggs/
203 | .installed.cfg
204 |
205 | # Installer logs
206 | pip-log.txt
207 |
208 | # Unit test / coverage reports
209 | .coverage
210 | .tox
211 |
212 | #Translations
213 | *.mo
214 |
215 | #Mr Developer
216 | .mr.developer.cfg
217 |
--------------------------------------------------------------------------------
/000 - change the world.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/000 - change the world.jpg
--------------------------------------------------------------------------------
/000a - Data Science Stack.md:
--------------------------------------------------------------------------------
1 | Blog post on data science stack
--------------------------------------------------------------------------------
/000c - Medicare Blog Post.md:
--------------------------------------------------------------------------------
1 | Use medicare data
--------------------------------------------------------------------------------
/000d - SQL Regex.md:
--------------------------------------------------------------------------------
1 | Use regex
2 |
3 |
4 | https://connect.microsoft.com/SQLServer/feedback/details/261342/regex-functionality-in-pattern-matching
--------------------------------------------------------------------------------
/000e - Pesticides.md:
--------------------------------------------------------------------------------
1 | http://stackoverflow.com/questions/19611729/getting-google-spreadsheet-csv-into-a-pandas-dataframe
2 |
3 |
4 | http://vitals.lifehacker.com/why-you-shouldnt-buy-organic-based-on-the-dirty-dozen-1689190822
5 |
6 | http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3135239/
7 |
8 | Not all pesticides are the same
9 |
10 |
11 | * http://www.extremetech.com/extreme/218689-what-are-endocrine-disruptors-and-how-should-you-protect-yourself-from-them
12 | * http://well.blogs.nytimes.com/2014/12/08/bpa-in-cans-and-plastic-bottles-linked-to-quick-rise-in-blood-pressure/
13 | * http://arstechnica.com/tech-policy/2015/05/eu-dropped-plans-for-safer-pesticides-because-of-ttip-and-pressure-from-us/
14 | * http://www.theatlantic.com/health/archive/2015/02/the-food-babe-enemy-of-chemicals/385301/
15 | * http://theconversation.com/the-mercury-level-in-your-tuna-is-getting-higher-37147
16 | * http://world.openfoodfacts.org/ <- for side business
17 | * http://ajcn.nutrition.org/content/84/3/475.full.pdf
18 | * https://www.supertracker.usda.gov/default.aspx
19 | * http://www.minnpost.com/earth-journal/2014/11/arsenic-laden-rice-fda-deliberates-consumer-reports-issues-guidance
20 | * http://www.reuters.com/article/2015/03/20/us-monsanto-roundup-cancer-idUSKBN0MG2NY20150320
21 |
22 |
23 |
24 | EWG's “Dirty Dozen” list of hormone-disrupting chemicals | @ewg @saferchem
25 |
26 |
27 |
28 |
29 |
30 |
31 |
32 |
--------------------------------------------------------------------------------
/000f - Picking a College Using Data.md:
--------------------------------------------------------------------------------
1 | ## Picking a Good College Using Data
2 |
3 | * Look at majors vs. schools.
4 | * School 'prestige'
5 | * School overhead
6 | * Total price - tuition, debt. Look at ROI
7 |
8 | * Look at PayScale data
9 | * Look at US News & World Report data, methodology
10 |
11 | - Value of school to prep people for a career. Value of school to prep people to be good citizens.
12 | - Try finding the amount schools spend on administration, other things by scraping Mechanical Turk?
13 | * http://fivethirtyeight.com/features/more-high-school-grads-decide-college-isnt-worth-it/
14 | * http://www.nakedcapitalism.com/2014/03/us-university-science-shopping-mall-model.html
15 | * http://seattletimes.com/html/businesstechnology/2023239544_apxwealthgapstudentloans.html
16 | * http://www.nytimes.com/2014/04/01/opinion/bruni-our-crazy-college-crossroads.html?src=me&ref=general
17 | * http://priceonomics.com/the-phd-deluge/
18 | * https://www.discover.com/student-loans/majors/index.html
19 | * http://mobile.nytimes.com/2014/06/29/upshot/americans-think-we-have-the-worlds-best-colleges-we-dont.html
20 |
21 |
22 | http://static.googleusercontent.com/media/www.google.com/en/us/googleblogs/pdfs/google_public_data_march2010.pdf
--------------------------------------------------------------------------------
/001 - Personal Security.md:
--------------------------------------------------------------------------------
1 | # Personal Security
2 |
3 | * Target, Home Depot, JP Morgan, you name it
4 |
5 | ### Core Lessons
6 |
7 | * The incentives for keeping data secure are all missing
8 | * Data's genius is also its Achilles heel: perfect copying
9 | * There's no way to prove you are you that a hacker can't also exploit
10 | * You will get hacked. The incentives are there.
11 |
12 |
13 |
14 | ### Make Yourself a Hard Target
15 |
16 | * Don't re-use passwords
17 | * Pick complicated passwords. Use pass phrases. Random generator. Password vault of some kind.
18 | * Two-factor auth
19 | * Fake identity questions
20 | * Change passwords over time
21 | * Credit freeze
22 | * Computer, smartphone security
23 | * Don't use services that get hacked.
24 | * Companies that pay their IT professionals low amounts.
25 | * Companies that have been hacked before.
26 | * Any place that limits the length of your password
27 | * Also goes for significant others, spouses, kids
28 | * Offer this as a service to friends, in trade.
29 |
30 | #### When You Get Hacked, Find Out Quickly
31 |
32 | * Credit monitoring services
33 | * Alerts
34 |
35 |
36 | #### When You Get Hacked, Have It Not Be a Huge Deal
37 |
38 | * Separation of security
39 | * Password reset emails
40 | * Identify single points of failure
41 | * Accounts
42 | * Locations
43 | * Devices
44 | * Companies that know enough about you that they could use it as leverage
45 | * Google
46 | * Facebook
47 | * Amazon
48 | * Anything that has your email or web-browsing habits
49 | * Banks
50 | * Medical companies
51 | * (DRAW A GRAPH, TYPICAL AND SAFE)
52 | *
53 |
54 |
55 | ### Pay for Good Ideas
56 |
57 | * One-time, limited-time debit/credit cards
58 | * Two-factor auth
59 | * Strong encryption
60 | * Anything that gives companies incentives to protect your data (higher liability, reputational risk, etc)
61 |
62 |
63 | ### Know Your Limits
64 |
65 | * Don't try protecting yourself from huge states (Mossad or not-Mossad).
66 | * Add XKCD comic.
67 | * Add James Mickens references.
68 |
69 |
70 | ### Not a Goal: Don't Be a Target
71 |
72 | * Keep a low profile (ummm)
73 | * Have nothing worth stealing (ummm)
74 | * Don't speak up about things you care about, that could make you a target (GamerGate, gun violence, inequality, racism, you name it)
--------------------------------------------------------------------------------
/002 - California Water.md:
--------------------------------------------------------------------------------
1 | # California Water Supply
2 |
3 | Scatterplot showing:
4 | - Sales price for farmers
5 | - Amount of water used per pound, or per calorie.
6 |
7 | http://www.nytimes.com/interactive/2015/05/21/us/your-contribution-to-the-california-drought.html
--------------------------------------------------------------------------------
/003 - spot pricing 3.md:
--------------------------------------------------------------------------------
1 | ---
2 | author: DevNambi
3 | date: 2014-11-15
4 | layout: post
5 | slug: spot-predictions
6 | title: Predicting AWS Spot Pricing
7 | meta-description:
8 | - aws
9 | - amazon web services
10 | - ec2
11 | - vm
12 | - cloud computing
13 | - race to zero
14 | - cost of computing
15 | - machine learning
16 | - prediction
17 | ---
18 |
19 | In the last two blog posts we covered the basics of AWS spot instances and looked at the landscape of cloud computing competitors.
20 |
21 | Now let's see if we can predict prices. Our goal is a better understanding of how spot prices behave, so we can optimize our computing costs over time.
22 |
23 | #### Prices By Time
24 |
25 | Prices differ by time of day
26 |
27 | * Per biz hour / weekday.
28 | * Per time of day.
29 | * What days and times of day matter? What patterns exist?
30 | * Which times have the most price 'bursts'?
31 | * Which instance types have the most price 'bursts?'
32 |
33 | http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/spot-fleet.html
34 |
35 | http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/how-spot-instances-work.html
36 |
37 | http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-spot-limits.html
38 |
39 | ----
40 |
41 | **Part 3: Predictions**
42 |
43 | * Show the results, not the algo
44 | * Which regions have the most price 'bursts'?
45 | * Algos to look at: logistic regression, others. Depends on pricing strategy
46 | * When do price spikes happen? Can they be predicted?
47 | * Do price spikes happen across AZs? Regions? Instance types?
48 | * 'Spike' defined as above some threshold, IQR, absolute value.
49 | * For a given price, X, how many hours will it last for each instance type?
50 | * Split per AZ, per region
51 | * Split by biz hour, weekday
52 | * how much does the price vary per instance? Inter-quartile range?
53 | * Are prices normally distributed? If not, why not?
54 | * What are the gaps in the analysis
55 | * Not clear how many instances exist of each type, especially for the specialized ones.
56 | * Number of hours before the price goes to X
57 | * Trying to do either regression (hours until price goes above X) or classification (odds that price will go away )
58 | * Do an independent prediction for each instance type
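One way to make the 'spike' definition and the "hours before the price goes to X" question concrete is sketched below. Everything here is an assumption for illustration: the hourly prices are made up, `iqr_spike_threshold` and `hours_until_above` are hypothetical helper names, and the 1.5 multiplier on the IQR is a common outlier convention, not something these notes settle on.

```python
from statistics import quantiles


def iqr_spike_threshold(prices, k=1.5):
    """Return Q3 + k * IQR, a common outlier cutoff (k=1.5 is an assumed default)."""
    q1, _, q3 = quantiles(prices, n=4)
    return q3 + k * (q3 - q1)


def hours_until_above(prices, limit):
    """Index of the first hourly price strictly above `limit`, or None if none is."""
    for hour, price in enumerate(prices):
        if price > limit:
            return hour
    return None


# Made-up hourly spot prices for one instance type in one AZ.
hourly_prices = [0.030, 0.031, 0.030, 0.032, 0.031, 0.030, 0.250, 0.031]

threshold = iqr_spike_threshold(hourly_prices)
spike_hours = [h for h, p in enumerate(hourly_prices) if p > threshold]
print("spike threshold:", threshold)
print("spike hours:", spike_hours)
print("hours until price exceeds 0.10:", hours_until_above(hourly_prices, 0.10))
```

The same two helpers could be run per instance type, per AZ, and per region to get the splits listed above.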
--------------------------------------------------------------------------------
/005 - spot pricing 2.md:
--------------------------------------------------------------------------------
1 | ---
2 | author: DevNambi
3 | date: 2014-11-20
4 | layout: post
5 | slug: cheap-computing-comparison
6 | title: Comparing Computing for Fun and Profit
7 | meta-description:
8 | - aws
9 | - amazon web services
10 | - ec2
11 | - spot instances
12 | - google compute engine
13 | - gce
14 | - microsoft azure
15 | - windows azure
16 | - vm
17 | - cloud computing
18 | - race to zero
19 | - cost of computing
20 | ---
21 |
22 | In my last blog post, I gave an introduction to Amazon Web Services' spot instances. There were some great deals to be found.
23 |
24 | Let's look at the competition. How cheaply can we find computing resources using Google's [Compute Engine](https://cloud.google.com/compute/) and Microsoft's [Azure](http://azure.microsoft.com/en-us/)?
25 |
26 | First, let's compare the different instance types by both price and performance. For now, I'm going to assume that RAM speed is the same everywhere. It's only RAM capacity that matters.
27 |
28 | CPU speed, on the other hand, varies dramatically.
29 |
30 |
31 |
32 | **Azure**
33 |
34 | * http://azure.microsoft.com/en-us/pricing/details/virtual-machines/#Linux
35 | * Does it charge for local I/O? Does that even exist?
36 |
37 |
38 | **GCE**
39 |
40 | * Charges for local SSD I/O!
41 | * https://cloud.google.com/compute/docs/machine-types#standard
42 | * https://cloud.google.com/compute/docs/disks
43 | * https://cloud.google.com/compute/docs/local-ssd#pricing_and_quota
44 |
45 |
46 |
47 |
48 | https://aws.amazon.com/blogs/aws/focusing-on-spot-instances-lets-talk-about-best-practices/
49 |
50 |
51 |
52 | **Resources**
53 |
54 | * http://www.citeworld.com/article/2113976/cloud-computing/ultimate-cloud-speed-tests-amazon-vs-google-vs-windows-azure.html
55 | * http://blog.cloudharmony.com/2013/06/value-of-the-cloud-cpu-performance.html <- AMAZING
56 | * http://www.pythian.com/blog/comparing-cpu-throughput-of-azure-and-aws-ec2/
57 | * http://www.computerworld.com.au/article/539633/amazon_vs_google_vs_windows_azure_cloud_computing_speed_showdown/
58 | * http://sqlperformance.com/2014/05/io-subsystem/comparing-azure-vm-performance
59 | * https://cloudvertical.com/cloud-costs#cloud_costs/index
60 | * http://redmonk.com/sogrady/2014/11/18/iaas-pricing-patterns-1114/ <- VERY USEFUL
61 |
62 |
63 |
64 | #### Google Compute Engine
65 |
66 | * Figure out CPU per scaling factor for each
67 | * How much of a discount is this compared to GCE or Azure, since they don't have this feature
68 |
69 | #### Microsoft Azure
70 |
71 | * Figure out CPU per scaling factor for each
72 | * How much of a discount is this compared to GCE or Azure, since they don't have this feature
73 |
74 |
75 |
76 | ### When In Doubt, Competition
77 |
78 | If I were a large company, I would use *several* cloud computing solutions. My reasoning is simple: it's cheaper that way.
79 |
80 | 'Public cloud' infrastructure is incredibly expensive to build and engineer. The leaders in the field have some of the smartest engineers on the planet. The barriers to entry are *extremely* high.
81 |
82 | When I'm a customer of companies that have natural barriers to competition, I want there to be lots of choices. As long as many different cloud-computing companies exist, there will be [competition on price, regardless of what people say](http://recode.net/2014/11/12/amazon-cloud-chief-andy-jassy-dismisses-talk-of-price-war/). Competition leads to lower prices than monopolies; that's Economics 101 (LINKME).
83 |
--------------------------------------------------------------------------------
/005 - spot pricing 4.md:
--------------------------------------------------------------------------------
1 | ---
2 | author: DevNambi
3 | date: 2014-11-16
4 | layout: post
5 | slug: spot-strategy
6 | title: AWS Spot Strategy
7 | meta-description:
8 | - aws
9 | - amazon web services
10 | - ec2
11 | - vm
12 | - cloud computing
13 | - race to zero
14 | - cost of computing
15 | - bidding strategies
16 | - cloud arbitrage
17 | ---
18 |
19 | Over the last few days I've looked at AWS spot instances, competitors, and predicting their performance. Now, let's look at how to use this information.
20 |
21 |
22 | Bidding strategies:
23 | * Maximum bid, keep it running
24 | * Persistent bid at a certain price, trade-off for cost vs. runtime
25 | * Auto-analyzing bid, move to different locations and exploit deals over time.
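The cost vs. runtime trade-off of a persistent bid can be illustrated with a toy simulation. This sketch rests on a deliberately simplified model that is an assumption, not a description of AWS billing: the instance runs during any hour whose spot price is at or below the bid, and that hour is charged at the spot price. Partial hours, restart overhead, and lost work on interruption are ignored. The prices and the `simulate_persistent_bid` name are made up.

```python
def simulate_persistent_bid(hourly_prices, bid):
    """Return (hours_running, total_cost) for a persistent bid under the toy model."""
    # Hours in which the spot price is at or below the bid; each is billed at spot price.
    running = [price for price in hourly_prices if price <= bid]
    return len(running), sum(running)


# Made-up hourly spot prices, including one spike.
hourly_prices = [0.03, 0.03, 0.04, 0.25, 0.03, 0.05]

for bid in (0.03, 0.05, 0.30):
    hours, cost = simulate_persistent_bid(hourly_prices, bid)
    print(f"bid={bid:.2f}  hours running={hours}/{len(hourly_prices)}  cost=${cost:.2f}")
```

A higher bid buys more uptime, but the extra hours are the expensive ones, which is the trade-off the second strategy has to weigh.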
26 |
27 |
28 | Caveat. Sometimes there aren't very many of certain instance classes, so you can *bid against yourself*. Unfortunately there often isn't enough information about customer demand vs. supply to figure out what's going on (if you're being outbid by customers, or if AWS is reclaiming instances because it needs the supply for on-demand or other instance types).
29 |
30 |
31 | ----
32 | **Part 4: Strategy and Uses**
33 |
34 | * Public good. Science.
35 | * Offer to share
36 | * For a couple of prototypical workloads (use Netflix for an example), walk through the cost differential
37 | * Youtube video on strategies
38 | * http://santtu.iki.fi/2014/03/25/ec2-spot-price-minimum/
39 | * http://santtu.iki.fi/2014/03/20/ec2-spot-market/
40 | * http://santtu.iki.fi/2014/03/19/ec2-spot-usage/
41 | * http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-spot-limits.html
42 |
43 | * http://blog.yhathq.com/posts/how-yhat-does-cloud-balancing.html
44 |
45 |
46 | **Post-Publish Notes**
47 |
48 | * Notify interested groups
49 | * Wendy Pastrick (?) at Seattle Cancer Care Alliance
50 | * Chris Bare at Sage Bionetworks
51 | * Susan ____ from SUUC at Harborview
52 | * UW-IT cloud computing folks. IAM.
53 | * eScience Institute
54 | * UW CSE department. My grad-student contact there.
55 | * UW Physics department
56 | * https://www.youtube.com/watch?v=mKElyNabc0A&feature=youtu.be
57 | * Kevin Jorissen, Research Assoc there.
58 | * Fernando Villa, Research scientist there
59 | * They focus on cluster instances
--------------------------------------------------------------------------------
/008 - Car Analysis Three.md:
--------------------------------------------------------------------------------
1 | # Car Analysis - Hunting for Deals
2 |
3 | This past May, I gave a presentation at the PASS Business Analytics conference on using data to make decisions. During that presentation I
4 |
5 | I've covered this topic before. I've purchased my own car using data (LINKS), compiled a list of car buying best practices (LINKS), and even purchased a cheap car in 72 hours for my sister (LINK).
6 |
7 | This time, I wanted answers to questions.
8 |
9 | * What patterns can I find in safe vs. unsafe cars?
10 | * What patterns can I find in cheap vs. expensive cars?
11 | * What are the best deals for a car nowadays?
12 |
13 |
14 | ### Context is King
15 |
16 | All behavior and patterns are influenced by the fundamental rules of their context. When buying a car, there are some obvious truths:
17 |
18 |
19 | * The goal of a car is to reduce the time/effort it takes to get from one place to another. It therefore competes with bicycles, walking, buses, subways, planes, trains, and ZipCar, Car2Go, and Lyft* for convenience and value. (LINKS)
20 | * New cars are more expensive than used ones
21 | * Not all cars are created equal. They differ in features, quality, safety, and especially reliability.
22 | * However, cars of the same make and model will behave about the same, unless one of them has been damaged in some way.
23 | * Popular cars are more expensive than unpopular cars
24 |
25 |
26 | There are also some common truths. These are behaviors that happen *most* of the time, but not always:
27 |
28 | * Cars: the price of a car drops by 15-25% each year for the first 5 years.
29 | * The biggest expense is the car itself; it's not gas, or insurance, or repairs, it's the cost of purchasing the vehicle.
30 | * All of them wear down. They're machines. They have a finite lifespan; it's rare to hear of a car that lasts more than 300K miles or so, although cars with 200K miles on them are becoming fairly common.
31 | * It's cheaper to add fancy features (backup cameras, fancy speakers) after buying the car than at purchase time.
32 | * People normally drive around 12K miles a year. A 200K-mile car that's driven 12K miles a year will last around 16.7 years, using up about 6% of its life each year.
33 | * Most car purchases are made within 50 miles of where the owner lives.
34 | * Car models undergo 'revisions'. A 2009 Toyota Prius and 2011 Toyota Prius don't look alike, because there were a bunch of changes made. Therefore, different model years for the same car will have different behaviors and safety.
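
The depreciation and lifespan rules of thumb above can be turned into quick arithmetic. This is a minimal sketch, not a pricing model: the $25,000 sticker price is a made-up example, and the function names are hypothetical. It assumes the yearly drop compounds (each year's drop applies to the previous year's value).

```python
def value_after(price, years, annual_drop):
    """Value of a car after `years`, assuming a fixed yearly drop that compounds."""
    return price * (1 - annual_drop) ** years


def lifespan_years(total_miles=200_000, miles_per_year=12_000):
    """Years a car lasts if it dies at `total_miles` and is driven `miles_per_year`."""
    return total_miles / miles_per_year


new_price = 25_000  # hypothetical sticker price, for illustration only

# Range of values after 5 years, using the 15-25% rule of thumb.
best_case = value_after(new_price, 5, 0.15)
worst_case = value_after(new_price, 5, 0.25)

years = lifespan_years()
share_used_per_year = 1 / years  # fraction of the car's life used up each year

print(f"Value after 5 years: ${worst_case:,.0f} to ${best_case:,.0f}")
print(f"Lifespan: {years:.1f} years ({share_used_per_year:.1%} of its life per year)")
```

Under these assumptions, a car loses more than half its value in the first 5 years while using up less than a third of its expected miles.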
35 |
36 |
37 | I'll add one more truth, a psychological one:
38 |
39 | **A car doesn't have to be an expression of your personality**. It can be just a box with an engine that gets you from one place to another.
40 |
41 | http://wolfstreet.com/2017/03/26/automakers-record-incentives-to-slow-decline-in-sales/
42 |
43 | * http://tradeinqualityindex.com <- HOLY CRAP, THIS IS AMAZING
44 | * http://www.mrmoneymustache.com/2011/09/30/is-a-costco-membership-worth-the-cost/
45 | * http://arstechnica.com/cars/2015/05/meta-analysis-finds-self-braking-cars-reduce-collisions-by-38-percent/
46 | * http://consumerist.com/2015/05/20/gm-that-car-you-bought-were-really-the-ones-who-own-it/
47 | * http://money.usnews.com/money/personal-finance/articles/2015/06/09/startups-offer-new-ways-to-buy-and-sell-used-cars
48 | * http://www.nytimes.com/2015/06/24/business/senate-commerce-hearing-takata-airbag-nhtsa-general-motors.html
49 | * https://medium.com/@ade3/the-zombie-mobile-b03932ac971d
50 | * http://wolfstreet.com/2016/11/22/strongest-pillar-of-the-shaky-us-economy-has-cracked/
51 | * https://www.nytimes.com/2017/01/27/your-money/used-cars-takata-recalls.html
52 | * https://www.yourmechanic.com/article/the-most-and-least-expensive-cars-to-maintain-by-maddy-martin
53 | * https://www.nytimes.com/2017/04/20/automobiles/wheels/new-cars-technology.html <- FOR CAR BLOG POST
54 |
55 | * https://publish.manheim.com/en/services/consulting/used-vehicle-value-index.html <- also VERY useful for used cars
56 |
57 |
58 | ## Safe and Unsafe Cars
59 |
60 | * Do analysis
61 |
62 |
63 | Right now you can buy over a thousand different car models. Some are brand-new, some are a bit older, but you can find all of them.
64 |
65 | That's a lot, so we have to narrow down the field. Luckily, most of us can do this pretty easily.
66 |
67 | (DEMO) Car makes and models in Excel
68 |
69 | * Ensembling
70 | * Conditional formatting in Excel
71 | * Eliminating bad options vs. picking good ones
72 |
73 | **Ensembling**, a.k.a. model averaging (bagging is one well-known form of it).
74 |
75 | ***Big problem***: How do we choose what make and model of car to buy?
76 | * We know some reliable brands. Honda and Toyota are famous.
77 | * We ask our friends, family, neighbors, coworkers.
78 | * We rely on what has worked before.
79 |
80 | For anyone who followed the US presidential elections in 2008 and 2012: this is what Nate Silver did to forecast the outcome in each state.
81 |
82 | * This problem comes up all the time, in politics, medicine, finance, even cooking. Nobody has all of the information and no bias, but *collectively* there's enough information and the bias can average out.
83 | * Simple way: find the ratings from major car sites and average them. This is more reliable than any single site alone.
84 |
85 | I did this for small cars & sedans last year.
86 |
87 | * When there is no single reliable source of data, use the aggregate of different sources.
88 | * Combining the ratings of 10 different car-review sites is more accurate than the ratings of any single site.
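
The "simple way" above (average the ratings from several sites) can be sketched in a few lines. The site names and scores below are invented placeholders, not real review data, and the sketch assumes every site's rating has already been converted to the same 0-10 scale.

```python
from statistics import mean

# Hypothetical ratings on a shared 0-10 scale; names and numbers are made up.
ratings = {
    "Car A": {"site_1": 8.0, "site_2": 7.0, "site_3": 9.0},
    "Car B": {"site_1": 6.0, "site_2": 9.0},
    "Car C": {"site_1": 5.0, "site_2": 6.0, "site_3": 4.0},
}


def ensemble_scores(ratings):
    """Average each car's ratings across whichever sites rated it."""
    return {car: mean(scores.values()) for car, scores in ratings.items() if scores}


def rank(ratings):
    """Cars sorted from highest to lowest average rating."""
    scores = ensemble_scores(ratings)
    return sorted(scores, key=scores.get, reverse=True)


for car in rank(ratings):
    print(car, round(ensemble_scores(ratings)[car], 2))
```

A plain average treats every site as equally trustworthy. Weighting sites differently is possible, but it requires knowing which sites are better, which is exactly the information we don't have.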
89 |
90 |
91 | ## Cheap and Expensive Cars
92 |
93 | * Do analysis. Cost per mile. Cost of the car. TCO.
94 | * What is the price difference if you want to carry more than 5 people?
95 | * What is the price difference if you want to haul things?
96 | * What is the price difference if you want to buy a 'luxury' car brand?
97 | * What is the price difference between new and used cars?
98 | * Can you find used luxury cars at the same price as new econoboxes?
99 |
100 | ## Car Deals
101 |
102 | Our question asks "what's a good deal". A 'deal' is one where there's high value for low cost. So we need to define cost, and value.
103 |
104 | Value, though, is harder to define. It's the value proposition of a car; it's transportation that saves time/energy compared to walking.
105 |
106 | One simple way is the number of miles it can take us before it dies.
107 |
108 | The ratio of cost:value is therefore its total cost of ownership vs. the # of expected miles.
109 |
110 | We can simplify this to $ per expected mile.
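
As a rough sketch of that metric: divide what the car costs by the miles it has left. The 200K-mile lifetime is the rule of thumb from earlier in this post; the prices, odometer readings, and function name are hypothetical, and a real total cost of ownership would also include gas, insurance, and repairs.

```python
def cost_per_expected_mile(total_cost, current_miles, expected_lifetime_miles=200_000):
    """Dollars paid per mile the car is still expected to deliver."""
    remaining_miles = expected_lifetime_miles - current_miles
    if remaining_miles <= 0:
        raise ValueError("car is already past its expected lifetime")
    return total_cost / remaining_miles


# Hypothetical listings: (description, price in dollars, odometer miles).
listings = [
    ("new econobox", 20_000, 0),
    ("used sedan", 6_000, 100_000),
    ("old beater", 3_000, 180_000),
]

for name, price, miles in listings:
    print(f"{name}: ${cost_per_expected_mile(price, miles):.2f} per expected mile")
```

With these made-up numbers the cheapest car to buy is the most expensive per mile, which is the kind of pattern the metric is meant to surface.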
111 |
112 |
113 |
114 | (SWITCH TO TABLEAU, COST PER MILE, TCO PER MILE)
115 |
116 |
117 | * Eliminating bad options vs. picking good ones
118 | * Diminishing returns
119 | DEMO - ROI, diminishing returns, cost:value
120 |
121 |
122 | *Note: I did not mention Uber because of their recent behavior towards journalists. I don't believe companies that abuse their power should be given anything but scorn.*
123 |
124 |
125 |
126 | * http://www.nytimes.com/2015/03/27/automobiles/their-ranks-thinned-the-surviving-car-dealerships-thrive.html
127 | * http://www.nytimes.com/2015/03/31/business/dealbook/prosecutors-scrutinize-minorities-auto-loans.html
128 | * http://www.nytimes.com/2015/04/02/business/us-auto-sales-march.html
129 | * http://www.nytimes.com/2015/04/23/technology/personaltech/an-online-tune-up-for-the-used-car-marketplace.html
130 | * http://www.japantimes.co.jp/news/2014/04/07/business/gods-edging-out-robots-at-toyota-facility/
131 | * http://www.safetyresearch.net/blog/articles/toyota-unintended-acceleration-and-big-bowl-“spaghetti”-code
132 | * http://blog.instamotor.com/why-dealership-used-cars-cost-more/
133 |
134 |
--------------------------------------------------------------------------------
/008 - Car Prices.xlsx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/008 - Car Prices.xlsx
--------------------------------------------------------------------------------
/008 - Car Quality Stats.xlsx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/008 - Car Quality Stats.xlsx
--------------------------------------------------------------------------------
/008 - Car-Models.xlsx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/008 - Car-Models.xlsx
--------------------------------------------------------------------------------
/008 - Cars-For-Sale.xlsx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/008 - Cars-For-Sale.xlsx
--------------------------------------------------------------------------------
/010 - CameraAwesomePhoto (1).jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/010 - CameraAwesomePhoto (1).jpg
--------------------------------------------------------------------------------
/010 - CameraAwesomePhoto (2).jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/010 - CameraAwesomePhoto (2).jpg
--------------------------------------------------------------------------------
/010 - equality vs equity.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/010 - equality vs equity.jpg
--------------------------------------------------------------------------------
/010 - food lifecycle.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/010 - food lifecycle.jpg
--------------------------------------------------------------------------------
/010 - poor deserve the best.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/010 - poor deserve the best.png
--------------------------------------------------------------------------------
/010 - spider man.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/010 - spider man.jpg
--------------------------------------------------------------------------------
/015 - Tuition by Raw Numbers.xlsx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/015 - Tuition by Raw Numbers.xlsx
--------------------------------------------------------------------------------
/015 - UW Salary Info.xlsx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/015 - UW Salary Info.xlsx
--------------------------------------------------------------------------------
/015 - UW Student Tuition Plan.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/015 - UW Student Tuition Plan.pdf
--------------------------------------------------------------------------------
/015 - University Salary Analysis.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/015 - University Salary Analysis.md
--------------------------------------------------------------------------------
/016 - College from First Principles.md:
--------------------------------------------------------------------------------
1 | # College from First Principles
2 |
3 |
4 |
5 | ### How much should college cost?
6 |
7 | NOTE: this is a later blog post
8 |
9 | Let's look at colleges and universities from a student's perspective. What matters most of all is:
10 |
11 | * A degree they can show future employers
12 | * Good professors and teaching aides
13 | * A safe place to learn
14 |
15 | *Everything* else is optional.
16 |
17 | I'm going to shamelessly copy Elon Musk's idea of [analyzing cost from first principles](LINKME).
18 |
19 | **Cost of a large university degree**
20 |
21 | University professors
22 | TAs / grad students
23 | Cost to rent a building in a suburb.
24 | Cost to rent a studio in a suburb
25 |
26 | **Cost of a small liberal-arts degree**
27 |
28 | University professors
29 | TAs / grad students
30 | Cost to rent a building in a sleepy 'college' town.
31 | Cost to rent a house and share it in a sleepy 'college' town.
32 | Better student:teacher ratio
33 |
34 |
35 | **Cost of a 2-year, 'intensive' degree**
36 |
37 | * Rise of coding schools
38 | * Smaller staff.
39 | * Cost to rent
40 |
41 |
42 | #### Externalities
43 |
44 | This is an unfair comparison. It discards a lot of what colleges pride themselves on: academic research, fancy dorms,
45 |
46 |
47 | Cost of higher ed from first principles:
48 |
49 |
50 |
51 | *Full disclosure: I am a staff member at the University of Washington. I acknowledge that this may cause some bias; I have tried to stick to the facts in an attempt to counter this.*
52 |
53 |
54 | #### Gender Balance
55 |
56 | Use gender-prediction API (name?) to figure out gender per name. Use that to find gender balance per title, per category, and overall. Also look at salary imbalance.
57 |
58 |
59 | http://data.spokesman.com/salaries/state/2014/306-university-of-washington/
60 |
61 | http://data.spokesman.com/salaries/state/faq/
62 |
63 | http://fiscal.wa.gov/Salaries.aspx <- salaries. Not total compensation.
64 |
--------------------------------------------------------------------------------
/020 - Industries.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/020 - Industries.pdf
--------------------------------------------------------------------------------
/020 - ally bingo.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/020 - ally bingo.jpg
--------------------------------------------------------------------------------
/025 - Why Ethics.md:
--------------------------------------------------------------------------------
1 | # Title
2 |
3 | * Ethics for Data Professionals
4 | * Ethics for Professionals
5 | * Professional Ethics: (SECTION)
6 |
7 |
8 | ## Sections
9 |
10 | 1. What is Ethical?
11 | 2. Why Ethics?
12 | * Is my job ethical?
13 | * Why Should I Care?
14 | 3. Looking for Ethics in All The Right Places
15 | * What industries are ethical?
16 | 4. Disrupting unethical industries.
17 |
18 | ## Ethics for Professionals: What and Why
19 |
20 | * "Why should I care?"*
21 | * "I like to work on X technology"*
22 |
23 |
24 | "If a man isn't proud of what he does, then he isn't proud of his living" - garbageworker from the 1960's Memphis Civil Rights movement and protests.
25 |
26 | Technical professionals have a lot of power. We only rarely consider the effects of our power and how our work is being used.
27 |
28 | (IMAGE: with great power comes great...you know)
29 |
30 |
31 | We all like to believe we are working to make our society, our world, a better place. At the very least we want to believe we aren't making things worse.
32 |
33 |
34 | #### You Didn't Build Yourself
35 |
36 | If you are reading this, then you're lucky enough to afford an Internet connection, which means you are most likely not starving, have a well-built home with heat, running water, and electricity. It also means you're probably a highly educated IT professional or software engineer making at least $40K a year, if not much more. That puts you in the ___% percentile of the population.
37 |
38 | You also didn't build yourself. Chances are you had family that sacrificed to raise you, a society that paid to educate you, and civic services that collectively worked to keep your community safe, warm and intact. As children we consume resources; we don't contribute back to society until we are older.
39 |
40 | That's fine. But saying someone is 'self-made' should mean they taught themselves without a teacher, without parents or family as guides, without police or firefighters to keep them safe. It didn't happen to you.
41 |
42 | Imagine if you were born to a family at the midpoint of our world's population. You wouldn't have the opportunity for an education. Your parents would make ___.
43 |
44 | We are the lucky ones. We won the lottery considering where we were born, where we grew up, and the part of society we were born into.
45 |
46 | "If not me, then who? If not now, then when?" (ATTRIBUTE QUOTE)
47 |
48 | There are several reasons you should care about ethics:
49 |
50 | 1. You're extremely lucky; you've hit the circumstantial jackpot!
51 | 2. The planet is making those chances harder.
52 | 3. At some level, you like to think of yourself as a good person.
53 | 4. If data professionals don't start behaving ethically *en masse*, the data revolution will be worse for the average person than the Industrial Revolution (REFINE).
54 | 5. At some level you want to believe that your compatriots and colleagues are working on the side of the angels.
55 |
56 |
57 | * http://www.bbc.com/future/story/20150130-the-man-who-studies-evil
58 | * http://www.ecouterre.com/reality-show-sends-fashion-bloggers-to-work-in-cambodian-sweatshop/
59 | * http://time.com/3694368/make-internet-better-place/
60 | * http://www.theguardian.com/commentisfree/oliver-burkeman-column/2015/feb/03/believing-that-life-is-fair-might-make-you-a-terrible-person?CMP=share_btn_fb
61 | * http://www.queerty.com/bayard-rustin-the-gay-dreamer-behind-dr-kings-i-have-a-dream-speech-20130828
62 |
63 | #### Is your job ethical? How do you know?
64 |
65 | That's great. Prove it.
66 |
67 | Engineers, researchers, and IT pros are taught to use data to justify our technical instincts. Writing tests, analyzing server logs, and collecting data for scientific experiments are all examples of our understanding that *we don't know as much as the data can tell us*.
68 |
69 | Let's apply that same principle to a different topic: ethics and work.
70 |
71 |
72 |
73 |
74 | We live in a place with massive inequality.
75 |
76 | * Start with a premise...we are created equal.
77 | * But our results are nowhere near equal, and it's not because of fate. It's partly, if not mostly, due to circumstances.
78 | * We live on a planet that's rapidly losing the ability to be habitable to us.
79 | * That's mostly for two reasons: we are consuming more per capita, and we are an overpopulated species.
80 | * Preventing unwanted pregnancies is a highly ethical thing to do.
81 | * Environment as a closed system. Optimizing under unknown constraints.
82 | * We live on a planet where exploiting other people is highly rewarded.
83 | * Most of us don't think this way. We think in terms of tools and problems. We remove the human results of our actions from the equation, because we're not comfortable with people.
84 | * Nobody likes to think of themselves as a bad person. So we find ways to justify our actions or explain them away.
85 | * You feel better knowing you're making the world a better place. It puts a spring in your step. You're glad when someone asks "Where do you work and what do you do?"
--------------------------------------------------------------------------------
/025 - ethics.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/025 - ethics.jpg
--------------------------------------------------------------------------------
/040 - ML companies.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/040 - ML companies.png
--------------------------------------------------------------------------------
/040 - NIST.SP.800-179.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/040 - NIST.SP.800-179.pdf
--------------------------------------------------------------------------------
/040 - data ethics slide.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/040 - data ethics slide.jpg
--------------------------------------------------------------------------------
/040 - ead_v1.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/040 - ead_v1.pdf
--------------------------------------------------------------------------------
/040 - ethical tech.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/040 - ethical tech.jpg
--------------------------------------------------------------------------------
/040 - the scored society.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/040 - the scored society.pdf
--------------------------------------------------------------------------------
/050 - Privacy.markdown:
--------------------------------------------------------------------------------
1 | Attention Please: This is not a rant, but it is long. I'm going to keep this as tinfoil-hat free as possible.
2 |
3 | People thought we were kind of silly when we went on an anti-Google tirade a little while ago, and I admit I have relapsed into most of those services. With the recent news about PRISM, I'm now scrubbing my life of non-essential services (Facebook, Google, Microsoft, Skype, etc.), and moving anything "cloud" overseas to MEGA and the like. Expect to see less and less of us on Facebook, and for what we post here to become very generic and non-controversial.
4 |
5 | Albert Einstein said, "Never do anything against conscience, even if the state demands it." The seven companies now known to be complicit in the US government's unending quest to invade more and more of our lives either have no conscience, or disagree with this philosophy. Either way, they now stand counter to my own convictions. There are a number of alternative services out there that are either foreign-based and not beholden to US law, or are small enough that they are not on the radar. Use them. Vote with your choices in software, services, purchases, habits. If you can, use an SMS application that encrypts your texts (TextSecure on Android) and your voice communications (RedPhone on Android) and your IM communications (GibberBot on Android). Use a mail client that supports PGP. Use a firewall in your home. Use open-source, community-developed alternatives like FireFox and Pidgin instead of Chrome and Google Talk. Use OTR plugins to keep your communications secure. Use Tor.
6 |
7 | These little, "inconvenient" things can add up, and I promise you that once you adopt them and get used to them, you'll forget what you thought was so inconvenient. That is the blessing and the curse of the human mind: we adjust quickly to change when forced. So the same psychology that the NSA and FBI and DHS have used to cow us into accepting ever-increasing encroachment into our personal lives can in fact be used as a weapon against that encroachment, through the adoption of more secure practices.
8 |
9 | The time for laughing it off as a nutter conspiracy theory is over. What happened in Iran can happen here. What happened in East Germany can happen here. What happened in the former USSR can happen here. What happened in Chile can happen here. These were free societies, of different economic policy, that were overtaken by autocracy. The only way to prevent it is to fight it. The only way to fight it is to starve it.
10 |
11 | Each of us has a choice, and our choices matter.
12 |
13 |
14 | Google is like 2000s-era Microsoft. So pervasive it's impossible to get away. http://www.businessinsider.com/r-exclusive-google-aiming-to-go-straight-into-car-with-next-android---sources-2014-12
15 |
16 |
17 | * http://thomaslarock.com/2014/03/safe-data-theft/
18 | * http://www.fastcoexist.com/3027665/the-nsa-can-learn-all-your-secrets-from-your-phone-metadata
19 | * http://billmoyers.com/2014/03/13/tips-for-protecting-your-privacy-online/
20 | * http://us.macmillan.com/dragnetnation/JuliaAngwin/
21 | * http://technet.microsoft.com/library/cc722487.aspx
22 | * http://www.extremetech.com/internet/180485-the-ultimate-guide-to-staying-anonymous-and-protecting-your-privacy-online
23 | * http://blogs.computerworld.com/security/23805/michaels-finally-confirms-massive-pos-hack-aaron-bros-well
24 | * http://www.theguardian.com/world/2014/jun/07/stephen-fry-denounces-uk-government-edward-snowden-nsa-revelations
25 | * http://www.pewinternet.org/2014/12/18/other-resounding-themes/ <- "Privacy will become a luxury good"
--------------------------------------------------------------------------------
/050 - The Database of Ruin.md:
--------------------------------------------------------------------------------
1 | ## The Database of Ruin
2 |
3 | Privacy matters.
4 |
5 | The more valuable our data is, the more likely someone is to steal it from us.
6 |
7 |
8 | * Black swans happen
9 | * How to measure likely impact.
10 | * Humans underestimate tail risk.
11 | * The riskier the behavior, the more you should default-to-safe.
12 | * Humans are risk averse.
13 | * Prototypes are great for this.
14 | * Change + Risk = Constant per culture (company, org, relationship).
15 | * Compatibility is huge.
16 |
17 |
18 | * http://boingboing.net/2014/03/03/full-nhs-hospital-records-uplo.html
19 | * http://www.technologyreview.com/photoessay/533426/the-troll-hunters/
20 | * http://betaboston.com/news/2014/03/05/a-vast-hidden-surveillance-network-runs-across-america-powered-by-the-repo-industry/
21 | * http://www.extremetech.com/computing/177945-how-big-business-builds-license-plate-databases-that-track-your-every-move
22 | * http://radar.oreilly.com/2014/03/the-creep-factor-how-to-think-about-big-data-and-privacy.html
23 | * http://flowingdata.com/2014/12/15/when-data-gets-creepy/
24 | * http://krebsonsecurity.com/2014/03/experian-lapse-allowed-id-theft-service-to-access-200-million-consumer-records/
25 | * http://gigaom.com/2014/03/13/with-data-brokers-selling-lists-of-alcoholics-to-big-business-the-feds-have-some-thinking-to-do/
26 | * http://mobile.nytimes.com/blogs/bits/2014/12/23/data-broker-is-charged-with-selling-consumers-financial-details-to-fraudsters/
--------------------------------------------------------------------------------
/050 - online tracking.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/050 - online tracking.jpg
--------------------------------------------------------------------------------
/060 - Precautionary Principle.md:
--------------------------------------------------------------------------------
1 | ## The Precautionary Principle
2 |
3 | * Black swans happen
4 | * How to measure likely impact.
5 | * Humans underestimate tail risk.
6 | * The riskier the behavior, the more you should default-to-safe.
7 | * Humans are risk averse.
8 | * Prototypes are great for this.
9 | * Change + Risk = Constant per culture (company, org, relationship).
10 | * Compatibility is huge.
11 |
12 |
13 | Risk vs. reward. Risk aversion.
14 |
15 | * http://www.bloomberg.com/news/2014-04-11/nsa-said-to-have-used-heartbleed-bug-exposing-consumers.html
16 | * http://www.wired.com/2014/04/hospital-equipment-vulnerable/
17 | * http://www.pqed.org/2014/06/how-should-people-respond-to-open-carry.html
18 | * http://www.economist.com/news/technology-quarterly/21615064-following-example-maker-communities-worldwide-hobbyists-keen-biology-have
19 | * http://arstechnica.com/science/2015/04/apollo-13-the-mistakes-the-explosion-and-six-hours-of-live-saving-decisions/
20 | * http://pando.com/2014/10/18/gms-hit-and-run-how-a-lawyer-mechanic-and-engineer-blew-the-lid-off-the-worst-auto-scandal-in-history/
21 | * http://www.wired.com/2012/10/ff-why-products-fail/all/
22 | * http://www.theatlantic.com/features/archive/2014/03/the-toxins-that-threaten-our-brains/284466/
23 | * http://www.nytimes.com/2015/04/10/opinion/why-pilots-still-matter.html?_r=0
--------------------------------------------------------------------------------
/070 - climate change viz.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/070 - climate change viz.jpg
--------------------------------------------------------------------------------
/070 - space ceded to cars.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/070 - space ceded to cars.jpg
--------------------------------------------------------------------------------
/080 - Music and Engineering.md:
--------------------------------------------------------------------------------
1 | # Music and Engineering
2 |
3 | * Both require creativity
4 | * Both have a 'frame' of constraints
5 |
6 | Musicians and Programmers - An Animusic Review - Music, Computers, Math & P
7 | http://www.digitalmusicnews.com/permalink/2014/06/23/fk-heres-entire-youtube-contract-indies
8 | Not Just for Music: Drumming Is Therapy, Too - The Daily Beast
9 | What keeps Henry Rollins productive? The musician/author/speaker/actor shar
10 | Why do we listen to our favourite music over and over again? Because repeat
11 | How Musicians Really Make Money in One Long Graph - Derek Thompson - The At
12 |
13 |
14 |
--------------------------------------------------------------------------------
/100 - Google Wagging the Dog.md:
--------------------------------------------------------------------------------
1 | ---
2 | author: DevNambi
3 | date: 2014-02-14
4 | layout: post
5 | slug: google-wag
6 | title: Google is wagging the whole Internet
7 | meta-description: In this blog post, Dev Nambi writes about the massive impact Google is having on all of software engineering.
8 | tags:
9 | - data science
10 | - signal vs. noise
11 | - fud
12 | - marketing
13 | - big data
14 | - machine learning
15 | - learning
16 | - distributed computing
17 | ---
18 |
19 | One of the more famous engineering diagrams is the OSI network model. It describes the different layers of a network, and what each layer is responsible for. It's a *beautiful* example of how separation of concerns and abstraction can be used to build a large system.
20 |
21 | It's the model that defines the engineering around the entire Internet.
22 |
23 | Data engineering doesn't have an equivalent model. This is my attempt to create one.
24 |
25 |
26 | * MapReduce
27 | * HDFS
28 | * BigTable (HBase)
29 | * Dremel and Drill, Parquet. Columnar big data.
30 | * F1
31 | * Wagging the dog
32 | * omega and mesos
33 | * no-case servers and open compute
34 | * Entire industries around SEO
35 | Ajax in maps.
36 | Large storage in email inboxes.
37 |
38 | * http://the-paper-trail.org/blog/the-elephant-was-a-trojan-horse-on-the-death-of-map-reduce-at-google/
39 | * http://www.kdnuggets.com/2014/08/sibyl-google-system-large-scale-machine-learning.html <- Sibyl
40 | * http://www.slate.com/blogs/business_insider/2014/10/23/behind_the_scenes_look_at_google_data_centers.html
41 |
42 | ### The Elephant Has Left the Building
43 |
44 | Hadoop is almost a decade old. It's established. It's also showing its age. The original MapReduce paper came out in 2004 (LINK).
45 |
46 | #### HDFS
47 |
48 | The HDFS file system is still immensely popular, even with companies that are working to 'replace Hadoop'. I have 2 guesses why:
49 |
50 | 1. It does a great job of maintaining file integrity using inexpensive disks without sacrificing performance.
51 | 2. Filesystems are hard to create.
52 |
53 | #### MapReduce
54 |
55 | MapReduce, on the other hand, hasn't aged as well. It works well for some problems, but it turns out to be very limiting for a lot of others. In particular, its batch-oriented processing paradigm makes it useless for low-latency (interactive) queries.
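
To make the batch-oriented point concrete, here is a toy, single-machine sketch of the MapReduce pattern (word count). It is an illustration in plain Python, not Hadoop's API; the function names are my own. Notice that the reduce step can't produce an answer until every document has been mapped and grouped.

```python
from collections import defaultdict


def map_phase(documents):
    """Map: emit a (word, 1) pair for every word in every document."""
    for doc in documents:
        for word in doc.lower().split():
            yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key. This needs the full map output."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: collapse each key's list of values into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(documents):
    return reduce_phase(shuffle(map_phase(documents)))


print(word_count(["the elephant left", "the building"]))
```

On a real cluster each phase also reads from and writes to disk, so even a small question pays the cost of a full pass over the data.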
56 |
57 | Google replaced MapReduce with Dremel. Then it replaced that with F1.
58 |
59 | The most popular cutting-edge implementations of interactive big-data engines are probably Cloudera Impala and Apache Spark/Shark.
60 |
61 | ## The Layers
62 |
63 | Hardware, Infrastructure.
64 |
65 | Low-Level Operators
66 |
67 | High-Level Operators, Queries
68 |
69 | Algorithms, Parameters
70 |
71 | Languages
72 |
73 |
74 | #### Hardware
75 |
76 |
77 | #### Low-Level Operators
78 |
79 |
80 | #### High-Level Operators
81 |
82 |
83 | #### Algorithms, Parameters
84 |
85 |
86 | #### Languages
87 |
88 |
89 | ## The Future
90 |
91 |
--------------------------------------------------------------------------------
/100 - map-reduce funny.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/100 - map-reduce funny.png
--------------------------------------------------------------------------------
/110 - Big_Data_Landscape.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/110 - Big_Data_Landscape.png
--------------------------------------------------------------------------------
/110 - Data Stack.md:
--------------------------------------------------------------------------------
1 | ---
2 | author: DevNambi
3 | date: 2014-02-14
4 | layout: post
5 | slug: data-stack
6 | title: Data Stacks
7 | meta-description: In this blog post, Dev Nambi writes about the new data stacks.
8 | tags:
9 | - data science
10 | - signal vs. noise
11 | - fud
12 | - marketing
13 | - big data
14 | - machine learning
15 | - learning
16 | - distributed computing
17 | ---
18 |
19 | One of the more famous engineering diagrams is the OSI network model. It describes the different layers of a network, and what each layer is responsible for. It's a *beautiful* example of how separation of concerns and abstraction can be used to build a large system.
20 |
21 | It's the model that defines the engineering around the entire Internet.
22 |
23 | Data engineering doesn't have an equivalent model. It needs one as technology stacks, connectors, and processing models are invented, evolve, and die at a furious pace.
24 |
25 |
26 | http://radar.oreilly.com/2015/02/processing-frameworks-for-hadoop.html <- basically the article I wanted to write
27 |
28 |
29 | ### The Elephant Has Left the Building
30 |
31 | Hadoop is almost a decade old. It's established. It's also showing its age. The original MapReduce paper came out in 2004 (LINK).
32 |
33 | #### HDFS
34 |
35 | The HDFS file system is still immensely popular, even with companies that are working to 'replace Hadoop'. I have 2 guesses why:
36 |
37 | 1. It does a great job of maintaining file integrity using inexpensive disks without sacrificing performance.
38 | 2. Filesystems are hard to create.
39 |
40 | * http://www.slideshare.net/julienledem/th-210pledem
41 | * http://tachyon-project.org/
42 |
43 | http://venturebeat.com/2014/05/11/the-state-of-big-data-in-2014-chart/
44 | http://azure.microsoft.com/en-us/documentation/articles/documentdb-sql-query/
45 |
46 | #### MapReduce
47 |
48 | MapReduce, on the other hand, hasn't aged as well. It works well for some problems, but it turns out to be very limiting for many others. In particular, its batch-oriented processing paradigm makes it useless for low-latency (interactive) queries.
49 |
50 | Google supplemented MapReduce with Dremel for interactive queries. It later built F1, a distributed SQL database on top of Spanner.
51 |
52 | The most popular cutting-edge implementations of interactive big-data engines are probably Cloudera Impala and Apache Spark/Shark.
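
To make the batch-oriented model concrete, here is a minimal single-machine sketch of the MapReduce pattern in Python, using word count as the example. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are illustrative assumptions, not Hadoop APIs. A real cluster spreads each phase across many machines and hands intermediate results from one phase to the next, which is a big part of why the model struggles with interactive queries.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_phase(lines: Iterable[str]) -> List[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in every input line."""
    pairs = []
    for line in lines:
        for word in line.lower().split():
            pairs.append((word, 1))
    return pairs


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group values by key. The reduce step waits for the whole batch."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reduce_phase(grouped: Dict[str, List[int]]) -> Dict[str, int]:
    """Reduce: collapse each key's list of values into a single count."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run the three phases in order, one full batch at a time."""
    return reduce_phase(shuffle(map_phase(lines)))


if __name__ == "__main__":
    print(word_count(["the elephant has left", "the building"]))
```

Each phase has to finish over the whole input before the next one starts. That is fine for overnight jobs and painful when someone is waiting on an answer.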
53 |
54 | ## The Layers
55 |
56 | **Figure Out Where They Fit**
57 |
58 | * Summingbird
59 | * MemSQL
60 |
61 | #### Hardware, Infrastructure.
62 |
63 | * Resource management
64 | * Process monitoring and restartability
65 | * ZooKeeper
66 | * Mesos
67 | * Cloud computing tools (Chef, Puppet)
68 |
69 |
70 | #### Storage, Filesystem, Memory
71 |
72 | * I/O (serialization, etc)
73 | * Connectors, connectors everywhere
74 | * Kafka
75 | * Apache Storm
76 | * HDFS
77 | * RDD in Spark
78 | * HBase/Cassandra
79 | * Mongo
80 | * Tachyon
81 |
82 | #### Low-Level Operators
83 |
84 | * Pig
85 | * Connectors, connectors everywhere!
86 | * Hadoop operators (map, reduce)
87 | * Scala operators (map, flatMap, etc)
88 | * SQL operators (seek, scan, join, aggregate)
89 |
90 |
91 | #### High-Level Operators, Queries
92 |
93 | * ML
94 | * SQL (Hive, Shark)
95 | * Impala
96 | * Mahout
97 |
98 | #### Algorithms, Parameters
99 |
100 | * Brains
101 | * Hyperparameters
102 | * MLBase
103 |
104 | #### Languages
105 |
106 | * R
107 | * Python
108 | * Scala
109 | * Cascading
110 | * Java
111 | * .NET
112 | * C++
113 | * Cascalog
114 | * Clojure
115 |
116 |
117 |
118 | ## The Future
119 |
120 | * Consolidation
121 | * Go for simplicity (GraphLab)
122 | * Interoperability
123 |
124 |
--------------------------------------------------------------------------------
/110 - data stack hadoop cat.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/110 - data stack hadoop cat.png
--------------------------------------------------------------------------------
/110 - data stack.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/110 - data stack.jpg
--------------------------------------------------------------------------------
/115 - Cost of Complexity.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/115 - Cost of Complexity.jpg
--------------------------------------------------------------------------------
/115 - Systems Thinking.md:
--------------------------------------------------------------------------------
1 | # Systems Thinking
2 |
3 | * See the whole system and scenario, scale up and down in scope
4 | * NOC job as a basis for becoming a good developer
5 |
6 | https://medium.com/@joaomilho/festina-lente-e29070811b84
7 |
8 |
9 | ## Cost of Complexity
10 |
11 |
12 | Nassim N. Taleb (@nntaleb)
13 | 3/29/14, 9:50 AM
14 | General Principle: the solutions on balance needs to be simpler than the problems. (Otherwise the system collapses under its complexity)
15 |
16 | http://www.vanityfair.com/politics/2013/09/joint-strike-fighter-lockheed-martin.
17 |
18 | https://devopsu.com/blog/boring-systems-build-badass-businesses/
19 |
20 | http://www.vanityfair.com/business/2014/10/air-france-flight-447-crash
21 |
22 | http://firstround.com/article/The-one-cost-engineers-and-product-managers-dont-consider
--------------------------------------------------------------------------------
/120 - Cloud Computing incl Spot.md:
--------------------------------------------------------------------------------
1 | # Cloud Computing
2 |
3 |
4 | **Glacier**
5 | http://storagemojo.com/2014/04/25/amazons-glacier-secret-bdxl/
6 |
7 | What makes you think your tape drive is any better?
8 |
9 | Compression and network latency
10 |
11 |
12 | - Write PoSH cmdlet to upload files to AWS Glacier
13 |
14 |
15 |
16 | * http://recode.net/2014/11/12/amazon-cloud-chief-andy-jassy-dismisses-talk-of-price-war/
17 | * https://aws.amazon.com/blogs/aws/next-generation-of-dense-storage-instances-for-ec2/
18 | * http://www.slideshare.net/whiskybar/aws-ec2
19 | * http://www.salon.com/2014/11/13/amazons_dirty_energy_problem_is_about_to_get_even_worse/
20 | * http://www.theregister.co.uk/2014/11/10/kryders_law_of_ever_cheaper_storage_disproven/?mt=1415981641453
21 | * http://www.infoworld.com/article/2610403/cloud-computing/ultimate-cloud-speed-tests--amazon-vs--google-vs--windows-azure.html?page=4
22 |
23 |
24 | **How to pay for it?**
25 |
26 | * UW CSE
27 | * UW IT
28 | * Startups like Scalyr
29 | * Some other startup?
30 | * http://www.nouvola.com/
31 | * http://dataconomy.com/google-using-machine-learning-boost-efficiency-data-centres/
32 | * https://docs.google.com/a/google.com/viewer?url=www.google.com/about/datacenters/efficiency/internal/assets/machine-learning-applicationsfor-datacenter-optimization-finalv2.pdf
--------------------------------------------------------------------------------
/120 - cloud computing dev 1.JPG:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/120 - cloud computing dev 1.JPG
--------------------------------------------------------------------------------
/120 - cloud computing dev 2.JPG:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/120 - cloud computing dev 2.JPG
--------------------------------------------------------------------------------
/130 - Interdisciplinary.md:
--------------------------------------------------------------------------------
1 | ## Polyglot Engineering
2 |
3 |
4 | I have met many, many developers and IT engineers who think their job is something new. It's not. The practices involved in working with computers are new mixes of ancient techniques.
5 |
6 | I am inspired by Max Shron's excellent book, *Thinking with Data* (LINK). Max describes the linkages between data analysis and rhetoric, industrial design, and communication.
7 |
8 | ### Teams
9 |
10 | Very, very few software projects or systems are built by one person. The vast majority of the time a team is involved.
11 |
12 | Software engineering requires **psychology** and **social awareness**. The same traits you find in 8-year-olds when building a sand castle. Here are some of the "new" skills:
13 |
14 | * An awareness of people's emotions
15 | * Knowledge of how people communicate
16 | * Recognizing and developing
17 |
18 | Software engineering is **education**. You should always be learning: a new codebase, new techniques, new ideas. These build upon your current understanding. You'll also be teaching your peers or new hires.
19 |
20 | * Teaching
21 | * The Socratic Method
22 | * Documentation is like preparing curriculum.
23 |
24 | Software engineering is **communication**. The ongoing effort to hear what people mean, not just what they say. Asking precise, probing questions to get to the heart of the matter. Modulating your own communication style to match your audience.
25 |
26 |
27 | ### Self Awareness
28 |
29 | *You* build computer products. Your body, brain and reason are involved in the endeavor.
30 |
31 | Software engineering is about **self awareness**. Don't write emails when you're angry. Try to control your ego and hubris when writing code, so it doesn't become overly complicated.
32 |
33 | * Self-awareness. Religion. Meditation
34 |
35 |
36 | Software engineering is about **nutrition**. Eat healthy food so your brain's chemical pathways function well. Exercise so you have energy after you're done with your work.
37 |
38 |
39 | ### Science
40 |
41 | Software engineering borrows many, many things from science.
42 |
43 | Troubleshooting and debugging are the same as the scientific method.
44 |
45 | Designing a computer application is awfully similar to designing an experiment.
46 |
47 | Data analysis in science and data analysis in engineering are *exactly the same*.
48 |
49 | Algorithms and data structures are applied math.
50 |
51 | Explaining your data analysis is like journalism.
52 |
53 | ### Craft
54 |
55 | My best lessons in engineering came when I learned carpentry from my grandfather. He posed challenges and let me solve them multiple times, in different ways. That was a great series of lessons in complexity, the sublime beauty of good design, and the balance required between form and function.
--------------------------------------------------------------------------------
/135 - Curiosity and Ego.md:
--------------------------------------------------------------------------------
1 | ## Curiosity and Ego
2 |
3 |
4 | * We want to know as much as possible
5 | * We want to learn things that will help us
6 | * We don't want to spend time on things we already know, or which we'll never use.
7 | * ROC curves
8 | * We don't want to learn things that are wrong.
9 | * Ego vs. 'meekness'.
10 | * Active learning and education
11 | * XKCD comic on learning Perl in high school
12 | * How to learn, and how to get better at something. What does science tell us?
13 | * Can curiosity be the right mental approach?
14 |
15 |
16 |
--------------------------------------------------------------------------------
/140 - Net Neutrality.md:
--------------------------------------------------------------------------------
1 | # Net Neutrality
2 |
3 | http://consumerist.com/2014/04/29/everything-you-need-to-know-before-e-mailing-the-fcc-about-net-neutrality/
4 |
5 | - Net neutrality idea - imagine if roads were like that
6 |     - Traffic lights and speed limits changed depending on how much you paid
7 |     - And on how much the other side paid.
8 |     - It was also heavily subsidized by the government when first built.
9 |     - Now we're auctioning off some sidewalks and tunnel rights.
10 |     - Make it visual
11 |
12 |
13 |
--------------------------------------------------------------------------------
/150 - MonteCarlo.R:
--------------------------------------------------------------------------------
1 | # R learning script for Monte Carlo methods
2 | library(ggplot2)
3 | library(VGAM)
4 | theme_set(theme_bw())
5 |
6 |
7 | # blog post work
8 | set.seed(12345)
9 | req <- seq(1, 2250)
10 |
11 | qplot(x=req, y= (1 - ppareto(length(req), req, 0.138 ))*100, ylim=c(0,100), xlab="Number of Requests", ylab="% Complete")
12 |
13 |
14 | array(1 - ppareto(100, seq(1,100), 0.138 ))[20]
15 | plot(1 - ppareto(100, seq(1,100), 0.138 ))
16 | qplot(x=seq(1,100), y= 1 - ppareto(100, seq(1,100), 0.138 ), ylim=c(0,1)) # y is a fraction here, so the axis runs 0 to 1
17 |
18 | requests.df <- as.data.frame(cbind(req, 1 - ppareto(length(req), req, 0.138 )))
19 | names(requests.df) <- c("request_id","percent_complete")
20 |
21 | # requests.df$   # unfinished expression, commented out so the script parses
22 |
23 | head(requests.df)
24 |
25 |
--------------------------------------------------------------------------------
/150 - Project Paradox.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/150 - Project Paradox.png
--------------------------------------------------------------------------------
/150 - Scratch.R:
--------------------------------------------------------------------------------
1 | # R learning script for Monte Carlo methods
2 | library(ggplot2)
3 | library(VGAM)
4 |
5 | # http://www.youtube.com/watch?v=cpc9D0EVYSk
6 | # It's really about simulation. Monte Carlo looks at probability patterns.
7 | # Sample and replicate are really useful.
8 | #
9 | #
10 | #
11 | #
12 | #
13 | #
14 | #
15 | #
16 | #
17 | #
18 | #
19 |
20 | # Simulate a game of chance with coin tossing
21 | options(width=60)
22 | sample(c(-1,1), size=50, replace=TRUE) #sample from a given set.
23 | #Replace the values after you sample them
24 |
25 | # cumsum will do a rolling cumulative sum. That's handy.
26 |
27 | win <- sample(c(-1,1), size=50, replace=TRUE)
28 | cum.win <- cumsum(win)
29 | cum.win
30 |
31 |
32 | #extend, plot the sequence of cumulative winnings for 4 games
33 | par(mfrow=c(2,2)) #carves up the graphic frames into 4 pieces
34 | for (j in 1:4) {
35 | win <- sample(c(-1,1), size=50, replace=TRUE)
36 | plot(cumsum(win), type="l", ylim=c(-15,15))
37 | abline(h=0)
38 | }
39 |
40 | #there's a lot of variability here. Interesting
41 | # what do we see? There's a level at which you should declare victory and stop
42 | # pick a random set.seed, set it to a large number
43 |
44 | # 1) what's the probability of breaking even after 50 games?
45 | # 2) what is the likely number of tosses during which Peter will be winning?
46 | # 3) what's the value of Peter's best fortune?
47 |
48 | # simulate the random process once, and then repeat.
49 | # compute statistics as you repeat.
50 | #
51 |
52 | # user-defined function
53 | peter.paul <- function(n=50) {
54 | win <- sample(c(-1,1), size=n, replace=TRUE)
55 | sum(win) #fortune at the end of the game
56 | }
57 |
58 | peter.paul()
59 |
60 | F <- replicate(10000, peter.paul()) #calls a function a bunch of times
61 | #has 10000 values
62 | max(F) #highest value
63 |
64 | table(F) #frequency binning
65 | par(mfrow=c(1,1))
66 | plot(table(F)) #looks normally distributed
67 | # no odd numbers. Why is that? The video doesn't go into it, but a sum of 50 values that are each +1 or -1 is always even.
68 | # Only way to break even is if the head comes up n/2 times
69 |
70 | ## what are the chances he breaks even? It's the ratio of him finishing with 0 out of 10000
71 | # In the simulation it was 1119/10000, or .1119
72 | dbinom(25,size=50,prob=0.5) #comes out to be .112. That's really close
73 |
74 | #now on part 3
75 |
76 |
77 |
78 |
79 |
80 |
81 | # blog post work
82 | set.seed(12345)
83 | req <- seq(1, 2250)
84 |
85 | plot(ppareto(seq(1,100),1,0.548))
86 |
87 | array(1 - ppareto(100, seq(1,100), 0.138 ))[20]
88 | plot(1 - ppareto(100, seq(1,100), 0.138 ))
89 | qplot(x=seq(1,100), y= 1 - ppareto(100, seq(1,100), 0.138 ), ylim=c(0,1))
90 | qplot(x=seq(1,2250), y= 1 - ppareto(2250, seq(1,2250), 0.138 ), ylim=c(0,1))
91 |
92 | cdf_pareto <- function(length, location)
93 | {
94 | sum(dpareto(seq(1,length),location, shape=1))
95 | }
96 |
97 | req.df <- as.data.frame(req)
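98 | names(req.df) <- c('request')  # rename the data frame column, not the vector
99 | req.df$pareto <- NULL
100 | # req.df$pareto <- cdf_pareto(2250,   # unfinished call, commented out so the script parses
101 |
102 | # qplot(data=req.df, x=req, y=pareto)   # needs the pareto column from the unfinished call above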
103 |
104 |
105 | alpha <- 3;
106 | k <- exp(1);
107 | x <- seq(2.8, 8, len = 300)
108 | plot(x, dpareto(x, location = alpha, shape = k))
109 | qpareto(seq(0.1,0.9,by = 0.1),location = alpha,shape = k)
110 |
--------------------------------------------------------------------------------
/151 - Project Planning 2.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: Project Planning and Prioritization
3 | layout: post
4 | slug: project-expansion
5 | ---
6 |
7 | **Phases**
8 |
9 | 1. Assume customers of equal size, with equal request sizes. Assume all work is the same size. No contravening work. PMs of 1, 2, 3, up to 10 layers away. Add margins of error.
10 | - To add a feature that isn't requested, how much clearer a vision do you need to have?
11 | - What are the implications?
12 | - What are the assumptions behind guessing? How can we reduce that risk? Sampling.
13 | - Since risk is roughly proportional to guesstimate size, assume random risk up to 2X the size of the estimate. See what happens.
14 | - Punish misses 5X more than early successes. Human psychology for underpromising and overdelivering.
15 | 2. Assume customers of unequal size, with unequal request sizes. Repeat
16 | 3. Assume work is unequal in size, with unequal risk and impact. Repeat.
17 | - Impact if we don't take on too small or too-large work. Why. Repeat.
18 | 4. Play with different assumptions. Cynical. Idealistic. See what happens.
19 |
20 | Assume customers of unequal size, with unequal request sizes. Repeat
21 |
22 | - To add a feature that isn't requested, how much clearer a vision do you need to have?
23 | - What are the implications?
24 | - What are the assumptions behind guessing? How can we reduce that risk? Sampling.
25 | - Since risk is roughly proportional to guesstimate size, assume random risk up to 2X the size of the estimate. See what happens.
26 | - Punish misses 5X more than early successes. Human psychology for underpromising and overdelivering.
27 | - Consider impact of codebase dilution. Hiring more people and its inefficiencies.
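
One way to start on the "unequal customers" phase is to draw customer sizes from a heavy-tailed distribution and see how concentrated demand becomes. The sketch below is Python and rests on assumptions that are not in these notes: a Pareto distribution with shape 1.16 (a value often associated with an 80/20 split), a fixed seed, and the hypothetical helper names `customer_sizes` and `top_share`.

```python
import random
from typing import List


def customer_sizes(n_customers: int, shape: float = 1.16, seed: int = 12345) -> List[float]:
    """Draw one relative size per customer from a Pareto distribution (assumed model)."""
    rng = random.Random(seed)
    return [rng.paretovariate(shape) for _ in range(n_customers)]


def top_share(sizes: List[float], fraction: float = 0.2) -> float:
    """Share of total size held by the largest `fraction` of customers."""
    ranked = sorted(sizes, reverse=True)
    top_n = max(1, int(len(ranked) * fraction))
    return sum(ranked[:top_n]) / sum(ranked)


if __name__ == "__main__":
    sizes = customer_sizes(1000)
    print(f"Top 20% of customers hold {top_share(sizes):.0%} of total size")
```

Changing `shape` is a cheap way to play with the assumption: larger values make customers more alike, and smaller values let a handful of customers dominate.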
--------------------------------------------------------------------------------
/152 - Project Planning 3.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: Project Planning and Prioritization
3 | layout: post
4 | slug: project-sizes
5 | ---
6 |
7 | **Phases**
8 |
9 | 1. Assume customers of equal size, with equal request sizes. Assume all work is the same size. No contravening work. PMs of 1, 2, 3, up to 10 layers away. Add margins of error.
10 | - To add a feature that isn't requested, how much clearer a vision do you need to have?
11 | - What are the implications?
12 | - What are the assumptions behind guessing? How can we reduce that risk? Sampling.
13 | - Since risk is roughly proportional to guesstimate size, assume random risk up to 2X the size of the estimate. See what happens.
14 | - Punish misses 5X more than early successes. Human psychology for underpromising and overdelivering.
15 | 2. Assume customers of unequal size, with unequal request sizes. Repeat
16 | 3. Assume work is unequal in size, with unequal risk and impact. Repeat.
17 | - Impact if we don't take on too small or too-large work. Why. Repeat.
18 | 4. Play with different assumptions. Cynical. Idealistic. See what happens.
19 |
20 | 3. Assume work is unequal in size, with unequal risk and impact. Repeat.
21 | - Impact if we don't take on too small or too-large work. Why. Repeat.
22 |
23 | - To add a feature that isn't requested, how much clearer a vision do you need to have?
24 | - What are the implications?
25 | - What are the assumptions behind guessing? How can we reduce that risk? Sampling.
26 | - Since risk is roughly proportional to guesstimate size, assume random risk up to 2X the size of the estimate. See what happens.
27 | - Punish misses 5X more than early successes. Human psychology for underpromising and overdelivering.
28 |
29 | Size estimates are notoriously inaccurate, often wrong by an order of magnitude or more.
30 |
31 | Size estimates:
32 |
33 | * Takes 1 person 2 days = 2 person-days
34 | * Takes 5 people 2 weeks = 50 person-days (at 5 working days per week)
35 | * Takes 10 people 10 weeks = 500 person-days
36 |
37 | Software estimates usually fall on a logarithmic scale: making something bigger makes it 10X bigger, not 2X bigger. Also, the amount of risk increases proportionally.
38 |
39 | ### Questions
40 |
41 | * What happens when size estimates are unequal? How does risk play out?
42 | * Think about how iteration reduces risk. Shifting baselines.
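
The risk questions above lend themselves to a small Monte Carlo experiment. Here is a minimal Python sketch under stated assumptions: actual effort is the estimate times a uniform random factor between 0.5 and 2.0 (my reading of "random risk up to 2X the size of the estimate"), and a late finish is punished 5X as hard as an early finish is rewarded. The distribution, the order-of-magnitude sizes, and the names `simulate_actual`, `score`, and `expected_score` are all hypothetical.

```python
import random
from statistics import mean

MISS_PENALTY = 5.0  # misses hurt 5X more than early finishes help


def simulate_actual(estimate: float, rng: random.Random) -> float:
    """Actual effort: the estimate scaled by a uniform factor in [0.5, 2.0] (assumed)."""
    return estimate * rng.uniform(0.5, 2.0)


def score(estimate: float, actual: float) -> float:
    """Positive when work finishes early; negative and multiplied by the penalty when late."""
    delta = estimate - actual
    return delta if delta >= 0 else MISS_PENALTY * delta


def expected_score(estimate: float, trials: int = 10_000, seed: int = 12345) -> float:
    """Average score over many simulated projects with the same estimate."""
    rng = random.Random(seed)
    return mean(score(estimate, simulate_actual(estimate, rng)) for _ in range(trials))


if __name__ == "__main__":
    for estimate in (1, 10, 100):  # order-of-magnitude sizes, in person-days
        print(estimate, round(expected_score(estimate), 2))
```

Under these assumptions the expected score should come out negative and scale with the size of the estimate, which is one way to see why bigger work items carry more risk. Swapping in a different distribution or penalty is the "play with different assumptions" step.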
--------------------------------------------------------------------------------
/153 - Project Planning 4.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: Project Planning and Prioritization
3 | layout: post
4 | slug: project-assumptions
5 | ---
6 |
7 | **Phases**
8 |
9 | 1. Assume customers of equal size, with equal request sizes. Assume all work is the same size. No contravening work. PMs of 1, 2, 3, up to 10 layers away. Add margins of error.
10 | - To add a feature that isn't requested, how much clearer a vision do you need to have?
11 | - What are the implications?
12 | - What are the assumptions behind guessing? How can we reduce that risk? Sampling.
13 | - Since risk is roughly proportional to guesstimate size, assume random risk up to 2X the size of the estimate. See what happens.
14 | - Punish misses 5X more than early successes. Human psychology for underpromising and overdelivering.
15 | 2. Assume customers of unequal size, with unequal request sizes. Repeat
16 | 3. Assume work is unequal in size, with unequal risk and impact. Repeat.
17 | - Impact if we don't take on too small or too-large work. Why. Repeat.
18 | 4. Play with different assumptions. Cynical. Idealistic. See what happens.
19 |
20 | 4. Play with different assumptions. Cynical. Idealistic. See what happens.
21 |
--------------------------------------------------------------------------------
/160 - Communication and Storytelling.md:
--------------------------------------------------------------------------------
1 | # Communication
2 |
3 | **Reject the premise of an argument**
4 |
5 | * Context is king
6 | * Underlying assumption
7 | * Tone, message.
8 |
9 |
10 | ### Writing
11 |
12 | http://bighow.com/news/the-art-of-great-writing-60-writing-tips-from-6-alltime-great-writers
13 |
14 | Funny pictures and quotes.
15 | Math equations.
16 | Drafts
17 | Check spelling.
18 | Read out loud. Edit down.
19 | Remove all big words.
20 | Images - http://designrope.com/design/find-stock-photos-dont-suck/
21 | Only use active verbs.
22 | - Make a blog post checklist
23 |     - Has it been revised?
24 |     - Has it been spell checked?
25 |     - Has it been checked for grammar errors?
26 |     - If it talks about a feature or example, did you mention the SQL version you're using?
27 |     - Check against grammar and style books
28 |         - Strunk and White has impeccable style
29 | - "Write until you're absolutely in love with the work"
30 |
31 | * http://seriouspony.com/blog/2013/10/4/presentation-skills-considered-harmful
32 | * http://mobile.nytimes.com/2015/02/14/world/europe/russian-tv-insider-says-putin-is-running-the-show-in-ukraine.html?_r=1&referrer=
33 | * http://www.bakadesuyo.com/2014/12/how-to-read-people/
34 | * http://ozar.me/2015/02/best-presentations-based-pain/
35 | * https://tractionloops.com/web-property-systems/
36 |
37 |
38 | * Speaking: Entertain, Don’t Teach (hilarymason.com)
39 | * 8 Conversational Habits That Kill Credibility (Inc.com)
40 |
41 |
42 | * https://www.khanacademy.org/partner-content/pixar/storytelling
43 | * https://hynek.me/articles/speaking/
44 |
45 | * http://www.fastcodesign.com/3038950/evidence/the-science-of-politely-ending-a-conversation
46 | * http://www.theatlantic.com/education/archive/2014/12/how-scientists-are-learning-to-write/383685/?single_page=true
47 | * http://www.bobpusateri.com/archive/2015/02/why-you-should-submit-for-pass-summit-2015/
48 | * http://www.bbc.com/future/story/20150324-the-hidden-tricks-of-persuasion
49 | * http://www.artofmanliness.com/2012/08/22/how-to-make-small-talk/
50 | * http://www.washingtonpost.com/posteverything/wp/2015/05/26/powerpoint-should-be-banned-this-powerpoint-presentation-explains-why/
51 | * http://paulgraham.com/talk.html
52 | * http://qz.com/778767/to-tell-someone-theyre-wrong-first-tell-them-how-theyre-right/
53 | * http://blog.statuspage.io/why-public-apologies-suck
54 | * https://hackernoon.com/pr-101-for-engineers-7cd116cc5347
55 | * https://longreads.com/2017/04/12/the-elements-of-bureaucratic-style/
56 | * http://andrewchen.co/professional-blogging/
57 |
58 |
59 | Blog meta
60 | Who is my audience?
61 | DBAs who want to have a better relationship with their developers
62 | Developers who want to have a better relationship with their DBAs
63 | A DBA with 2+ years of experience, 1 or more dev teams to support, and friction
64 | A developer (DBE or not) with 1+ year of SQL Server development experience, who has a hard time working with their DBA(s).
65 | Write down one concept a minute for twenty minutes
66 | Then take each concept and write something about each for two minutes. Write two sentences about each. If you can't write two sentences, delete it. If it’s particularly juicy, note that and move on.
67 | Minimum length - go for as long as you can.
68 | This increases the chances that someone big will link to it, and your traffic will explode
69 | Use SnagIt for screen capture
70 | Look over Google Analytics to figure out how to improve blog posts
71 | Why did post A get 50% more hits than post B?
72 | - What caused me the biggest pain?
73 |     - This will always create new topics
74 | - Blog your life, challenges and improvements
75 |     - Tactics I use to get through the day
76 | - Often limited to 1/2 to 2 pages
77 | - Blog posts: fairly condensed, deals with a specific topic, and it is transitory
78 | - Why do we write?
79 |     - It helps us become a better researcher
80 |     - It helps us learn
81 |     - If we write properly, it helps us organize our thoughts
82 | - How often should I write?
83 |     - 1-2 times a week is a good starting point
84 |
--------------------------------------------------------------------------------
/160 - linkbait effectiveness.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/160 - linkbait effectiveness.png
--------------------------------------------------------------------------------
/170 - Getting Started With Programmind.md:
--------------------------------------------------------------------------------
1 | To get started, I'd recommend learning Python. It's one of the easiest languages to learn, and it's also one of the most widely used (http://redmonk.com/sogrady/2014/06/13/language-rankings-6-14/). It's also an open-source language, so you can install it anywhere. I'd recommend installing Anaconda (http://continuum.io/downloads), which takes care of a lot of the version-incompatibility headaches you can run into with open-source tools.
2 |
3 | There are some decent online sites that walk you through how to learn Python, notably Codecademy (http://www.codecademy.com/en/tracks/python). I'd also recommend a couple of books (http://learnpythonthehardway.org/, http://www.amazon.com/Python-Cookbook-Alex-Martelli/dp/0596007973). I'd start with the online sites to learn the basics, and then progress up to the books.
4 |
5 | I'd also recommend signing up for a GitHub account (https://github.com/) and putting all of your code and projects there. GitHub accounts are free, and a lot of the most popular open-source projects are hosted there (like Linux). If you work on projects regularly and improve, your GitHub account becomes a pretty compelling resume.
6 |
7 | It's going to take time, though, and a lot of patience. For me it was an endless series of evenings and weekends spent tinkering. One of my favorite teachers, Hilary Mason (http://www.hilarymason.com/), said that computer science is the endless process of playing with your curiosity and cleverness to find your way around an endless series of brick walls (http://www.hilarymason.com/presentations-2/devs-love-bacon-everything-you-need-to-know-about-machine-learning-in-30-minutes-or-less/ ). Many people run out of patience; it's probably the most common reason people stop learning to write code.
8 |
9 | The best way I've heard of to fight the disillusionment problem is to use code to work on a problem you're interested in, so it's not as abstract. For Sean, that could be playing with the music/audio utilities in Python (https://wiki.python.org/moin/PythonInMusic), or looking at stuff about Portland (food, housing, weather, etc). There are also local meetups that discuss programming (http://www.meetup.com/pdxpython/ and many others); those can be a lot of fun, and they're amazing places to learn.
10 |
11 | I hope this helps.
12 | Cheers!
13 | Dev
--------------------------------------------------------------------------------
/180 - Incentives.md:
--------------------------------------------------------------------------------
1 | # Incentives
2 |
3 |
4 |
5 | ## We Don't Appreciate What Works Well
6 |
7 | * Cultural bias
8 | * "The squeaky wheel gets the grease"
9 | * What effects does this lead to
10 | * "Bad cases make for bad law"
11 | * Post - undervaluing preventative work, overvaluing heroic fixes
12 | * http://finance.yahoo.com/blogs/breakout/target-s-pr-nightmare-continues-160404828.html
13 |
14 | ## On Influence
15 |
16 | ### The Carrot
17 |
18 | ### The Stick
19 |
20 | ### Trading Favors
21 |
22 | ### Why It Matters
23 |
24 | * Different perspectives
25 |
26 |
27 | **Metrics and Unintended Consequences**
28 |
29 | Schools - http://tpep-wa.org/student-growth-overview/student-growth-case-studies/
30 |
31 | Altruism - http://www.theatlantic.com/education/archive/2014/06/most-kids-believe-that-achievement-trumps-empathy/373378/
32 |
--------------------------------------------------------------------------------
/180 - incentives.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/180 - incentives.jpg
--------------------------------------------------------------------------------
/190 - Project Animal Names.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/190 - Project Animal Names.jpg
--------------------------------------------------------------------------------
/190 - Project Names 2.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/190 - Project Names 2.jpg
--------------------------------------------------------------------------------
/190 - Project Names.md:
--------------------------------------------------------------------------------
1 | ## Technology Names and FUD
2 |
3 | All technology products have names. While I try not to judge a book by its cover, I have found that some products can be identified as bad by their names alone.
4 |
5 |
6 | ### What's In a Name?
7 |
8 | What does a project or product name tell us? At the most basic level, they exist for easy identification. MySQL and PostgreSQL aren't named 'RDBMS 1' and 'RDBMS 2' for this reason.
9 |
10 | At a second level, they exist for *branding*. This is where the wheels fall off, because brands have contradictory goals:
11 |
12 | * They should be memorable
13 | * They should be accessible
14 | * They should describe the product
15 |
16 | What I have found is that product/project names can tell you a lot about the culture of the organization creating them.
17 |
18 | ### First There was The Sale
19 |
20 | When we are looking for a product to purchase, we need to know several things:
21 |
22 | * What the product does
23 | * How the product interrelates with *other* products
24 | * Where it comes from
25 | * Does it solve our business problem?
26 | * Does the company creating it
27 | * The cost
28 | * Supportability
29 | * Unambiguous.
30 |
31 | There is *no way* you can fit that into a single sentence, let alone a phrase or name.
32 |
33 | I am sick and tired of products that are "branded" to appeal to sales people. Why? Projects named by marketing people are inevitably targeted towards business executives, CIOs, CTOs, and managers.
34 |
35 |
36 | https://twitter.com/mrogati/status/395666192842510336
37 |
38 | Here are some names of technology projects/services that are given sales-y names:
39 |
40 | * Windows Azure
41 | * Windows
42 | * Office
43 | * Access
44 | * Word
45 | * Excel
46 | * Exchange
47 | * PowerPoint
48 | * Power View
49 | * Power Query
50 | * Power Pivot
51 | * PowerShell
52 | * Power BI
53 | * In-Memory OLTP
54 | * Q & A
55 | * Windows Azure SQL Database
56 | * SAP HANA
57 | * Salesforce
58 | * SugarCRM
59 | * Tableau
60 |
61 |
62 |
63 | ### Meanwhile, IRL
64 |
65 | When we use a product day in and day out, we have different requirements:
66 |
67 | * Easy to pronounce
68 | * Unambiguous (both as a project and in normal language)
69 |
70 | ...and that's it.
71 |
72 |
73 | ### Age of the Geek
74 |
75 | I am a huge fan of unusual project names, because an unusual name tells me something critical:
76 |
77 | **The product is about substance, and not appearance**
78 |
79 | I have seen far, far too many sales pitches for products that look great and never work correctly. I have a simple theory:
80 |
81 | * There's never enough engineering talent to go around.
82 | * One of the big limitations of any organization is the number of people who *detract* from the work that really matters
83 | * Managers (who aren't visionaries or practical)
84 | * PMs (who aren't customer advocates)
85 | * Salespeople / Marketing (who care more about sales than the product)
86 | * Legal (who are worried about potential liability)
87 | * Anybody who works primarily via email
88 | * Anybody who thinks in purely theoretical concerns.
89 | * When engineers pick names, engineers have influence in the company
90 | * When marketers pick names, marketers have influence in the company
91 |
92 | Why? Because they are about function, and not appearance.
93 |
94 | * What can it do?
95 | * What are its capabilities?
96 | * How does it work?
97 | * What new features are involved?
98 | * How do I adopt it?
99 | * What are the limitations?
100 | * How well does it compare to competing options?
101 |
102 |
103 |
104 | Hadoop is not only the elephant in the room, *it's the name of a stuffed elephant*
105 |
106 | Here are some names of technology projects that are given unusual names:
107 |
108 | * Hadoop
109 | * Hive
110 | * F1
111 | * Spanner
112 | * Dremel
113 | * Mahout
114 | * Spark
115 | * Shark
116 | * Apollo
117 | * Red Dog
118 | * MLBase / MLlib
119 | * DeepDive
120 | * Sandy Bridge
121 | * Ivy Bridge
122 | * Bay Trail
123 | * Solr
124 | * Lucene
125 | * Gump
126 | * Git
127 | * Linux
128 | * Pig
129 | * Impala
130 | * ZooKeeper
131 | * Tomcat
132 | * Python
133 | * R
134 | * Cassandra
135 | * CouchDB
136 |
137 |
138 | (ADD A TABLE )
139 |
140 | (ADD PET IMAGE: I think that engineers either spend too much, or too little time with their pets)
141 |
142 | I'm also a huge fan of technology project code names, because they're usually chosen by technical people. Marketing people don't pick names like this.
143 |
144 | Why does this matter?
145 |
146 | I want to know about the signal:noise ratio in a product.
147 |
148 | ### Signal
149 |
150 |
151 | ### Noise
152 |
153 | * Does it have all ___ list of features I don't care about?
154 | * Is it 'enterprise' ready?
155 |
156 |
157 | **Noisy Terms**
158 |
159 | * AlwaysOn (replaces Hadron)
160 | * PowerQuery
161 | * PowerView
162 | * In-Memory Index
163 | * Columnstore Index (to replace Apollo)
164 | * In-Memory OLTP (to replace Hekaton)
165 | * Windows Azure SQL Database
166 | * Windows Azure (to replace Red Dog)
167 | * Elastic Compute Cloud (EC2)
168 | * Simple Storage Service (S3)
169 | * Word
170 | * Excel
171 | * Access
172 | * Office
173 | * Office 365
174 | * '3rd generation i7' (to replace Ivy Bridge)
175 |
176 |
177 |
178 |
179 |
180 | Pig.
181 | Hadoop
182 | YARN
183 | Mahout
184 | Impala
185 | Hive
186 | Linux
187 | S3
188 | EC2
189 | Dremel. Drill
190 |
191 | http://www.theatlantic.com/features/archive/2014/04/the-origins-of-office-speak/361135/
192 |
193 | vs.
194 | RAC
195 | AlwaysOn
196 | SQL Server
197 | Windows
198 | Office
199 | SSAS
200 | SSRS
201 | PowerView
202 | PowerPivot
203 | Tableau
204 | Azure
205 | BigQuery
--------------------------------------------------------------------------------
/200 - data viz.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/200 - data viz.jpg
--------------------------------------------------------------------------------
/200 - data viz.md:
--------------------------------------------------------------------------------
1 | # Data Visualization
2 |
3 | * Categorical
4 | * Numeric
5 | * Optimizing for human psychology
6 |
7 | Scatterplots, bar charts, etc.
8 | Include guidance.
9 | Data Visualization
10 | Difficulty vs comprehension.
11 | Writing your own query
12 | Understanding the data model
13 | Picking a good graphic
14 | Understanding the graphic
15 | Asking a different question.
16 | Repeat
17 | Time involved.
18 |
--------------------------------------------------------------------------------
/2013-12-08-pagerank scale.md:
--------------------------------------------------------------------------------
1 | ---
2 | slug: pagerank
3 | title: Scaling PageRank in SQL
4 | layout: post
5 | author: Dev Nambi
6 | date: 2013-12-08
7 | meta-description: In this blog post Dev Nambi analyzes how PageRank scales in SQL.
8 | tags:
9 | - sql
10 | - sql development
11 | - PageRank
12 | - graph databases
13 | - scaling
14 | ---
15 |
16 | In [my last post](http://devnambi.com/2013/pagerank/) I put together an implementation of [PageRank](http://en.wikipedia.org/wiki/PageRank) using SQL. Now let's see how it scales.
17 |
18 | #### Tables
19 |
20 | I'll be using the same tables as before, **Nodes** and **Edges**
21 |
22 | {% highlight SQL %}
23 | CREATE TABLE Nodes
24 | (NodeId int not null
25 | ,NodeWeight decimal(10,5) not null
26 | ,NodeCount int not null default(0)
27 | ,HasConverged bit not null default(0)
28 | ,constraint NodesPK primary key clustered (NodeId)
29 | )
30 |
31 | CREATE TABLE Edges
32 | (SourceNodeId int not null
33 | ,TargetNodeId int not null
34 | ,constraint EdgesPK primary key clustered (SourceNodeId, TargetNodeId)
35 | ,constraint EdgeChk check (SourceNodeId <> TargetNodeId) --ignore self references
36 | )
37 | {% endhighlight %}
38 |
39 |
40 | #### Table Setup
41 |
42 | To run these tests I'm using my home workstation: a bog-standard Core i5-2500K CPU, 16GB of RAM, and a 1TB 7200rpm SATA drive that holds both tempdb and the PageRank test database.
43 |
44 | Whenever I run a test, I want to measure a few key metrics:
45 |
46 | * CPU time
47 | * Clock time
48 | * Logical I/O (memory accesses)
49 | * Physical I/O (reads and writes)
50 | * Number of iterations needed to converge
51 | * Number of nodes that converge each iteration
52 |
53 | In addition, I want to start small and scale up my tests. I'll be running several tests:
54 |
55 | * 10 nodes, 15 edges
56 | * 100 nodes, 175 edges
57 | * 1K nodes, 3K edges
58 | * 10K nodes, 50K edges
59 | * 100K nodes, 750K edges
60 | * 1 mil nodes, 10 mil edges
61 | * 10 mil nodes, 100 mil edges
62 | * 100 mil nodes, 1 billion edges
63 | * 1 billion nodes, 10 billion edges
64 |
65 | (LIST THE TESTS)
66 |
67 | I'm also adding a few performance tweaks to the tests:
68 |
69 | * Adding a [columnstore](ADD LINK) (columnar) index to the Edges table, since it is read-only.
70 |
71 | #### Results
72 |
73 | **TO DO**
74 |
75 | * CPU scaling
76 | * Iteration scaling
77 | * Logical I/O scaling
78 | * Time scaling
79 | * Physical I/O scaling
80 | * Bottleneck analysis
81 |
82 |
83 |
84 | #### Tweak #1: Data Compression
85 |
86 | SQL Server supports *data compression*, where a row is compressed to save space. It turns out that row compression for the Nodes table reduces its size by ____, reducing I/O by the same amount.
87 |
88 | #### Tweak #2: Excluding converged nodes
89 |
90 | The second tweak is an algorithm change, which excludes nodes that have converged from future iterations.
91 |
92 |
93 | #### Victory!
94 |
95 | As before, my code is available [on GitHub](https://github.com/DevNambi/SqlServerUtilities/tree/master/PageRank).
96 |
97 | **Happy Coding!**
--------------------------------------------------------------------------------
/2013-12-28 Productivity Analysis.xlsx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/2013-12-28 Productivity Analysis.xlsx
--------------------------------------------------------------------------------
/2014-05-19-social-network.md:
--------------------------------------------------------------------------------
1 | ---
2 | author: DevNambi
3 | date: 2014-05-19
4 | layout: post
5 | slug: social-network-intro
6 | title: Introduction to Social Network Theory
7 | meta-description:
8 | - passbac
9 | - analytics
10 | - social media
11 | - social network theory
12 | - nodexl
13 | - microsoft research
14 | ---
15 |
16 |
17 |
18 | ## Interesting Insights
19 |
20 | Facebook Explorer - http://developers.facebook.com/explorer/
21 | https://developers.facebook.com/docs/graph-api/reference/v2.0/milestone
22 |
23 |
24 |
25 |
26 |
27 | ## Social Networks
28 |
29 | * Crowds matter online...they're larger than real life often, but we understand them less. Inherently weak signal.
30 | * Central tenet - social structure emerges from the aggregate of relationships among members of a population.
31 | * Emergence of cliques and clusters. Centrality (core) and periphery (isolates), betweenness.
32 | * Methods - surveys, interviews, etc.
33 | * Social media is all about networks.
34 |
35 | Patterns are left behind.
36 |
37 | There are many kinds of ties:
38 |
39 | * Send
40 | * mention
41 | * like link reply rate review favorite friend follow forward edit tag comment check-in.
42 | * one way relationships: lend money to.
43 | * bidirectional: is married to.
44 |
45 | Social media platforms are meaningfully different from each other, but they all have one thing in common: networks.
46 |
47 | The US doesn't have public squares anymore, with people who disagree with us. If it happens at all it happens online.
48 |
49 | A network is born whenever two entities are joined.
50 |
51 | Network theory: position, position, position. It's all relative.
52 |
53 | NodeXL - like social media for graphs.
54 |
55 | Trying to be the Firefox of GraphML.
56 |
57 | GraphML - XML for social networks (a data structure)
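
For a sense of what that looks like, here is a minimal sketch that writes a handful of ties out as GraphML using Python's standard library. The function name and the sample people are made up for illustration; only the basic `graphml` / `graph` / `node` / `edge` elements are used, with none of the optional attribute data that tools like NodeXL attach.

```python
import xml.etree.ElementTree as ET

GRAPHML_NS = "http://graphml.graphdrawing.org/xmlns"

def ties_to_graphml(ties, directed=True):
    """Serialize (source, target) ties into a minimal GraphML string."""
    root = ET.Element("graphml", xmlns=GRAPHML_NS)
    graph = ET.SubElement(
        root, "graph", id="G",
        edgedefault="directed" if directed else "undirected",
    )
    # One <node> per distinct person, then one <edge> per tie.
    for name in sorted({n for tie in ties for n in tie}):
        ET.SubElement(graph, "node", id=name)
    for source, target in ties:
        ET.SubElement(graph, "edge", source=source, target=target)
    return ET.tostring(root, encoding="unicode")

# Hypothetical ties: alice mentions bob, bob replies to carol.
graphml_text = ties_to_graphml([("alice", "bob"), ("bob", "carol")])
```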
58 |
59 | Open Tools, Open Data, Open Scholarship.
60 |
61 | NodeXLGraphGallery.org - open data, user-generated collections/datasets.
62 | Open Scholarship - trying to make it easy.
63 |
64 | Try using the tool.
65 |
66 | ### 6 social network structures
67 |
68 | Divided or unified crowds
69 | Divided - political/controversial topic.
70 | United - some communities are unified.
71 | Fragmented - brand clusters
72 | they don't reply to each other.
73 | Clustered - community clusters
74 | they interact a bit.
75 | what happens when people grow up a bit.
76 | Hub-and-spoke - broadcast network
77 | PR/marketing.
78 | Institutional speaker.
79 | Called the 'audience' pattern - people who retweet don't interact with each other.
80 | Out-hub-and-spoke - support network
81 | Airline support.
82 | @DellCares
83 |
84 | The density of the connections is how
85 |
86 |
87 | ## Centrality
88 |
89 | * Eigenvector centrality.
90 | * PageRank
91 | * Betweenness centrality - influencers. The 'bridge' score.
92 | ME - look at this for side business.
93 |
94 | * Some connections are very important. Bridges. Only 2 points of connection. But they're the only thing that connects those two networks.
95 |
96 | When you are the bridge, you may charge a toll. It could be only social capital. It's hard because you connect to something that is not like you.
97 |
98 | Don't be a hub. Be a bridge.
99 |
100 | Isolates. It means there's never been an @____ in their tweets. It means they're the new members.
101 |
102 | IDEA FOR PASS: use social network analysis to identify influencers and new people to connect with.
103 |
104 | #CMgrChat - social media managers. Basically it's a small village.
105 |
106 | Look at the social network of people who are better at this than you. Find out, and then use this analysis to figure it out.
107 |
108 | ME - read more of stuff by Marc Smith, MSR researcher
109 |
110 | http://www.connectedaction.net/
111 |
112 | Last - plea for help.
113 |
114 | Because Excel is an ODBC sourcer, anything that can join 2 tables can work in NodeXL.
115 |
116 |
117 |
--------------------------------------------------------------------------------
/2014-06-01-democratization-of-bi.md:
--------------------------------------------------------------------------------
1 | ---
2 | author: DevNambi
3 | date: 2015-06-01
4 | layout: post
5 | slug: democratization
6 | title: The Democratization of Analysis
7 | meta-description:
8 | tags:
9 | - bi
10 | - analysis
11 | - democratization of bi
12 | - statistics
13 | - programming
14 | - data science
15 | - fud
16 | - marketing
17 | - self service BI
18 | ---
19 |
20 |
21 | Fight the Hippo
22 |
23 | Not everybody is cut out for this kind of work
24 |
25 | Make better decisions
26 |
27 | Same problem as voting. Most companies are autocratic, authoritarian, even fascist (dissent will not be tolerated). The main protection is people can vote with their feet.
28 |
29 |
30 | When is it a good idea
31 |
32 | When is it a bad idea
33 |
34 | Overfitting and cross-validation.
35 |
36 | Problem: leadership doesn't know what to trust, because of FUD. Words, good stories and fancy arguments don't prove themselves without data.
37 |
38 | Know how to spot logical fallacies and statistical fallacies. Cut through the noise.
--------------------------------------------------------------------------------
/210 - System Replacements.md:
--------------------------------------------------------------------------------
1 | # System Replacements
2 |
3 | * Have to keep the old system online
4 | * The people you have aren't always the people you need
5 | * It's cruel to hire folks for the new system and fire the old folks.
6 | * Vendor or temporary hires aren't good options because of misaligned incentives.
7 |
8 | The big way forward I can see is good, flexible design at the beginning. That, and training your existing folks to incrementally build a new system and acquire new skills
9 |
10 | * There's a thing about expectations vs. reality when it comes to timing, due dates and deliverables.
11 |
12 | http://effectivesoftwaredesign.com/2014/03/17/the-end-of-agile-death-by-over-simplification/
--------------------------------------------------------------------------------
/220 - Personal Automation.md:
--------------------------------------------------------------------------------
1 | # Personal Automation
2 |
3 | http://t.co/IHnSBmoTS1
4 |
5 | A classifier for important email (content + sender), a classifier for email -> response template, and a CRM timer
6 |
7 | The slowest part was going through old email & FB messages to build a training set.
8 |
9 | http://www.matthewjockers.net/2011/09/29/the-lda-buffet-is-now-open-or-latent-dirichlet-allocation-for-english-majors/
10 |
11 | https://automatedinsights.com/blog/automation-at-work-an-interview-with-hilary-mason/
12 |
13 | Replace myself with a series of Python scripts
14 |
15 | Automation? Think Causation, not Correlation
--------------------------------------------------------------------------------
/220 - software architecture.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/220 - software architecture.png
--------------------------------------------------------------------------------
/230 - Association Rules in SQL AdventureWorks 2012.sql:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/230 - Association Rules in SQL AdventureWorks 2012.sql
--------------------------------------------------------------------------------
/230 - Basic ML Using SQL.markdown:
--------------------------------------------------------------------------------
1 | # Machine Learning using SQL
2 |
3 | ### Key points
4 | * Move the computation to the data
5 | * People already have databases. Other languages / tools are hard to find. You also have to keep data sets in sync, deal with extracts and such, and the result often goes back *into* an application or database for use.
6 |
7 | ### The Bimodal toolset (size does matter)
8 | There are roughly 2 sets of tools nowadays. Smaller data sets (less than ~10GB or so) can fit into memory, and can be analyzed on workstations using tools like R, Python or Julia. (PROVIDE LINKS). **Large** data sets (greater than 1TB) are best analyzed using 'big data' (distributed computing) tools like Hadoop or Mahout.
9 |
10 | Between the two is where *most* data sets currently fit. They're too big to easily fit into memory, and too small to benefit from the massive scale of Mahout. They will fit into 'big data' solutions, sure, but at smaller scales like this you run into overhead challenges.
11 |
12 | If only we had a flexible, powerful, possibly interpreted language that would work on datasets between these two sizes. It turns out we do: *SQL*.
13 |
14 | There is one other option: sampling. It's perfectly viable to take a random sample of a large data set, confirm it has the same distribution properties, and work on it using something like R or Python.
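
A minimal sketch of that check in Python, assuming the data is a plain list of numbers. The function name, the 5% tolerance, and the fixed seed are arbitrary choices for illustration, and comparing only the mean and standard deviation is a crude test; a real check would look at the full distribution.

```python
import random
import statistics

def sample_matches_population(values, sample_size, tolerance=0.05, seed=42):
    """Draw a random sample and compare its mean and spread to the full data set.

    Returns (sample, looks_similar). The tolerance is expressed as a fraction
    of the full data set's standard deviation.
    """
    rng = random.Random(seed)
    sample = rng.sample(values, sample_size)  # sampling without replacement
    full_mean = statistics.mean(values)
    full_sd = statistics.pstdev(values)       # population standard deviation
    samp_mean = statistics.mean(sample)
    samp_sd = statistics.stdev(sample)        # sample standard deviation
    mean_ok = abs(samp_mean - full_mean) <= tolerance * full_sd
    sd_ok = abs(samp_sd - full_sd) <= tolerance * full_sd
    return sample, (mean_ok and sd_ok)
```

If the check fails, draw a larger sample (or a different one) before trusting any analysis built on it.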
15 |
16 | For many machine learning algorithms, SQL works just fine. Let's look at some examples of how to do this.
17 |
18 | http://arcanecode.com/2013/05/07/updating-adventureworksdw2012-for-today/
19 |
20 | http://www-users.cs.umn.edu/~sarwat/RecDB/
21 |
22 |
23 | ### Matrix math in SQL
24 |
25 | A *large* number of machine learning algorithms use matrix mathematics. Techniques such as Principal Component Analysis use it extensively.
26 |
27 | **Matrix addition**
28 |
29 | Sparse and dense
30 |
31 | Use data volumes too big for R
32 |
33 | **Matrix subtraction**
34 |
35 | **Matrix multiplication**
36 |
37 | **Matrix division**
38 |
39 | **Matrix transposition**
40 |
41 | **Eigenvalues**
42 |
43 | **Eigenvectors**
44 |
45 |
46 |
47 | ### TF-IDF
48 |
49 | **Cosine similarity**
50 |
51 |
52 | ### K-Means
53 |
54 | **Euclidean distance**
55 |
56 | ? How to measure variance covered?
57 | ? How to measure variance left?
58 | ? How to use functions besides Euclidean distance?
59 | ?
60 |
61 |
62 | ### Association Rules
63 |
64 | This is used for things like 'market basket analysis'. If you buy chips at a grocery store, what *else* are you likely to buy? Turns out it is chips. If you buy diapers at a grocery store, what are you likely to buy? Turns out it's beer.
65 |
66 | Association rules are designed to work on a transactional table. Luckily SQL databases tend to have several of those. Let's use an example transaction table from the AdventureWorks database, [TABLE NAME]
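
Before getting to the SQL, here is a small Python sketch of the two numbers association rules are built on: *support* (how often a pair of items shows up together) and *confidence* (given the first item, how often the second is in the same basket). The baskets below are invented toy data, not AdventureWorks rows, and the function name and 0.3 support threshold are assumptions.

```python
from collections import Counter
from itertools import combinations

def pair_rules(baskets, min_support=0.3):
    """Return {(a, b): (support, confidence)} for rules 'a -> b' over item pairs."""
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # de-duplicate items within a transaction
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = {}
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # too rare to be interesting
        rules[(a, b)] = (support, count / item_counts[a])
        rules[(b, a)] = (support, count / item_counts[b])
    return rules

# Hypothetical transactions, one list of items per basket.
baskets = [
    ["diapers", "beer", "chips"],
    ["diapers", "beer"],
    ["chips", "salsa"],
    ["diapers", "wipes"],
]
rules = pair_rules(baskets)
```

With this toy data only the diapers/beer pair clears the support threshold. In SQL the same counts fall out of a self-join on the transaction table grouped by item pair.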
67 |
68 | ### Bayesian Math
69 |
70 |
71 |
72 | ### Decision Trees
73 |
74 | ### Statistics
75 |
76 | * Percentiles
77 | * Boxplot
78 | * Median
79 | * Mode
80 | * Distribution
81 | * Correlation
82 | * T-test
83 | * Mutual information criterion
84 | * Rolling average
85 | * Trailing average
--------------------------------------------------------------------------------
/230 - CameraAwesomePhoto.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/230 - CameraAwesomePhoto.jpg
--------------------------------------------------------------------------------
/240 - SQL and Digraphs.markdown:
--------------------------------------------------------------------------------
1 | # SQL and dependency graphs (digraphs)
2 |
3 | Blog post on SQL dependency graphs
4 |
5 | The most popular ML algorithms are:
6 |
7 | Decision Trees / Regression Trees
8 | Linear Regression
9 | K-Means
10 | Association rules
11 |
12 | Apache Spark uses a DAG
--------------------------------------------------------------------------------
/250 - Finding a Vacation Using Data.markdown:
--------------------------------------------------------------------------------
1 | # Quick Guide for a vacation
2 |
3 | Kate and I were wondering about the best way to have a vacation. Also, *why* a vacation?
4 |
5 | ### Why A Vacation
6 |
7 | People are not designed to work continuously. We certainly didn't evolve over hundreds of thousands of years to be indoors all the time, nor sitting, nor at a desk job.
8 |
9 | Also, there's the fundamental question of: why are we here? To help our employers make money? To make the world a better place? To enjoy ourselves?
10 |
11 | I'd argue it's the last two.
12 |
13 | People aren't deterministic. They wear down over time. Their productivity is unpredictable, bursty, and prone to lots of different factors.
14 |
15 | Many of the most effective engineers I know take mental health days and have lots of hobbies. They use their vacation time. They're invariably intentional about it.
16 |
17 | Life: optimize time for X. X is what you care about. Time is what you can trade for it.
18 |
19 | Vacations are like that.
20 |
21 | **Implications**
22 |
23 | * Don't go on an expensive vacation. Taking a weeklong trip to Hawaii may be pointless if you have to work for a month extra to pay for it.
24 | * Think about end goals.
25 |
26 |
27 |
28 | ## Cost
29 |
30 | * Airfare
31 | * Rental car
32 | * Places to stay
33 | * Luggage limitations - buying things
34 |
35 |
36 | ## Environmental impact
37 |
38 | Airplanes - .638 to 1 pound of CO2 per passenger per mile. For both of us to fly to San Francisco (a distance of <> miles) means a CO2 footprint of <> pounds.
39 |
40 | Driving - We drove 10,508 miles last year, using 268.4 gallons of gas. That comes to an average of 39.2 miles per gallon. We get better gas mileage on freeways because they aren't as hilly (hills are death to a Prius' gas mileage). Given that there are 19.6 pounds of CO2 in a gallon of gas, that comes out to a carbon footprint of 5261 pounds, or roughly 1/2 pound of CO2 per mile.
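
The driving arithmetic above, written out as a short Python check. The three inputs come straight from the paragraph; the variable names are just for illustration.

```python
MILES_DRIVEN = 10_508        # miles driven last year
GALLONS_USED = 268.4         # gallons of gas used
LBS_CO2_PER_GALLON = 19.6    # pounds of CO2 per gallon of gas

miles_per_gallon = MILES_DRIVEN / GALLONS_USED
total_co2_lbs = GALLONS_USED * LBS_CO2_PER_GALLON
co2_lbs_per_mile = total_co2_lbs / MILES_DRIVEN

print(f"{miles_per_gallon:.1f} mpg")
print(f"{total_co2_lbs:.0f} lbs of CO2 per year")
print(f"{co2_lbs_per_mile:.2f} lbs of CO2 per mile")
```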
41 |
42 | There is also the carbon footprint of building the car, amortized over its life. A Prius takes about <> pounds of CO2 to make, and will hopefully last us 120K to 180K miles, since we bought it with 27K miles on it. Assuming the lifetime mileage for a Prius is about 180K miles, that comes out to <> pounds of CO2 per mile to run the car.
43 |
44 | Rental Cars - rental cars are usually newer cars. A big part of .
45 |
46 | If we drove to San Francisco, that'd be carbon footprint of.
47 |
48 | There are other reasons as well.
49 |
50 |
51 | * Don't go so fast
52 | * Stop and enjoy the sights.
53 | * Variety, and serendipity, are the spice of life
54 |
55 |
56 |
--------------------------------------------------------------------------------
/260 - Startups and Y Combinator.markdown:
--------------------------------------------------------------------------------
1 | # On Startups
2 |
3 | Y Combinator is the elephant in the room. It has created companies like AirBnB, Dropbox, FlightCar.
4 |
5 | There's a theme with all of these:
6 |
7 | *Make existing resources more efficient*
8 | *Build a network of supply and demand*
9 |
10 | Company: AirBnB
11 | Demand: People who want to stay somewhere overnight, for a few days. Vacationers, business folks at work.
12 | Supply of Spare Resources: Existing homes that are vacant. Spare rooms. Backyard cottages.
13 |
14 | Company: FlightCar
15 | Demand:
16 | Supply of Spare Resources: People who leave their cars at the airport
17 |
18 | Company: RescueTime
19 | Demand:
20 | Supply of Spare Resources: People who are working inefficiently.
21 |
22 | Company: Uber, Lyft
23 | Demand: People who want rides
24 | Supply of Spare Resources: People with cars and a bit of spare time.
25 |
26 | Company: Dropbox
27 | Demand:
28 | Supply of Spare Resources:
29 |
30 | Company: Payscale, Glassdoor
31 | Demand: People wanting to know about pay and working conditions in different jobs
32 | Supply of Spare Resources: People who are currently employed in different companies and can complain/brag about their
33 |
34 |
35 | http://www.industrytap.com/the-printer-that-can-print-a-house-in-20-hours/9056
36 |
37 | Company:
38 | Demand:
39 | Supply of Spare Resources:
40 |
41 | Company:
42 | Demand:
43 | Supply of Spare Resources:
44 |
45 | Company:
46 | Demand:
47 | Supply of Spare Resources:
48 |
49 | Company:
50 | Demand:
51 | Supply of Spare Resources:
52 |
53 | Company:
54 | Demand:
55 | Supply of Spare Resources:
56 |
57 | http://siliconhillslawyer.com/2014/03/15/409a-service-cash-cows-get-slaughtered/
58 |
59 | http://www.wired.com/2014/04/no-exit/?hn
--------------------------------------------------------------------------------
/270 - Smell Test Dilbert.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/270 - Smell Test Dilbert.jpg
--------------------------------------------------------------------------------
/270 - The Smell Test.md:
--------------------------------------------------------------------------------
1 | ## The Smell Test
2 |
3 | * Longevity of code
4 | * Good judgment can be instinctual
5 | * Best way to develop is trial and error
6 | * Can't be taught, must be learned the hard way
7 | * Useful when applying your knowledge to different situations
8 | * Different way of thinking. Not conscious per se.
9 | * http://dilbert.com/strips/comic/2014-11-06/
--------------------------------------------------------------------------------
/280 - Agile and Waterfall.md:
--------------------------------------------------------------------------------
1 | ## Agile and Waterfall
2 |
3 | ### Story sizing
4 | * More important when in a semi-waterfall environment.
5 | * Good signal that your story process has too much overhead
6 | * Important for 'agilefall' because overhead happens all the time.
7 |
8 |
9 | ### Agilefall
10 |
11 | * The mullet of software development. Agile at its core, waterfall around it.
12 | * Project mgmt nightmare.
--------------------------------------------------------------------------------
/290 - Data Science Evolution.md:
--------------------------------------------------------------------------------
1 | ## Data Science and Data Warehousing
2 |
3 | They are a cautious match.
4 |
5 | Data Warehousing - stability.
6 |
7 | Data Science - discovery, new things.
8 |
9 | * Those cause friction.
10 | * Feature engineering is a big problem - a lot of ML algorithms work only when you have the right data. DW limits what you get, and drastically slows down what you can add.
11 | * Tooling is another. DW is largely relational databases and cubes. Cutting edge is viz tools like Tableau, and 'big data' tools like Hadoop/Hive.
12 | * Data Science uses an overlapping tool set, including things like
13 | * DW includes a lot of process overhead. The reason is the assumption that 'if you build it, they (analysts) will come'. That doesn't happen very often. Also, operational reporting (what has happened) is far, far easier than predictive analysis (what will happen) and optimization (how can I change what will happen to be optimal). They have similar tooling, sometimes, but very dissimilar skills.
14 | * Operational reporting - no margin of error.
15 | * Prediction - margin of error. Limitations in what can be predicted with the data.
16 | * I've seen job postings for 'big data engineers' or 'platform engineers' that focus largely on pipelines. 'Pipelines' are a natural fit for ETL developers and anyone who is comfortable with query optimization; the principles are the same, but the tools are different.
17 | * Not design-heavy. Data modeling happens *after* you know what you need to build. DS helps build data *products*. The data model for Netflix's "Movies you may like" is far less important than the application itself.
18 |
19 | **Becoming a DS**
20 |
21 | Starting over
22 | Rejecting jobs and work you are qualified for. That's a trap.
23 | Being humble. Admit you don't know crap. Learn from smarter people.
--------------------------------------------------------------------------------
/320 - Feature Engineering.md:
--------------------------------------------------------------------------------
1 | ## Feature Engineering
2 |
3 | * Longitude and Latitude as an example
4 | * Mention deep learning
5 | * Adding different data sets. They're often from public sources.
6 | * Data cleansing / munging is huge. It's 80% of data science.
7 | * Not all attributes are created equal. In fact, they are dramatically unequal. They are also only identified using ML trial and error. Design and HIPPO won't help you here.
8 | * Goes with the idea that data is abundant.
9 |
10 | http://www.analyticshumor.com/search?updated-max=2014-04-03T08:24:00-07:00&max-results=10#sthash.Gqa5F4va.uxfs
--------------------------------------------------------------------------------
/330 - Cognition for Data Professionals.md:
--------------------------------------------------------------------------------
1 | ## Limits of Cognition for Data Professionals
2 |
3 | * We are limited by our own physiology.
4 | * Attention spans, pomodoro.
5 | * Using sampling for rapid iteration. Stats helps with this.
6 | * Data visualization
7 | * Visualization *friction*
8 | * We can only hold about 7 items in working memory. But a pattern counts as one item.
9 | * Sleep is huge.
10 | * Eating well is huge.
11 | * Cost of distractions.
12 | * Telling a good story is important for this reason. People think in story and narrative, not numbers.
13 | * Book about limitations of smart people.
14 | * Thinking Fast and Slow.
15 | * Even statistically literate people don't reason statistically by intuition. Psychology plays tricks against us.
16 | * Having people vet our work is helpful.
17 | * Trying to prove ourselves wrong is also helpful.
18 | * So is thinking of things from a fresh perspective.
19 | * Creativity is destroyed when you're too busy. Go for a walk. Take an extra shower.
20 | * Carry a little pad of paper around. Good ideas happen at random moments. Capture them.
21 | * (Articles in the \Health and \Lifehacker sections of Pocket)
22 | * Curiosity and pride. Humility is helpful.
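The sampling point in the list above can be made concrete. This is a minimal sketch, assuming the data fits in a Python list and a simple random sample is appropriate; `sample_mean_with_error` is a name invented for this example. The standard error tells you how much to trust the quick answer before spending time on the full dataset.

```python
import random
import statistics


def sample_mean_with_error(values, n, seed=0):
    """Estimate the mean from a random sample of size n.

    Returns (sample mean, standard error of the mean).
    """
    rng = random.Random(seed)          # seeded so iterations are repeatable
    sample = rng.sample(values, n)     # simple random sample, without replacement
    mean = statistics.fmean(sample)
    stderr = statistics.stdev(sample) / n ** 0.5
    return mean, stderr
```

If the standard error is already small compared to the decision you need to make, iterating on the sample is usually enough.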
23 |
24 | "Work is most fulfilling when you're at the comfortable, exciting edge of not quite knowing what you are doing." - https://twitter.com/alaindebotton
25 |
26 | * Know your own skills
27 | * Know your weaknesses
28 | * Know your effect on your company, and the company's effects on the world.
29 |
30 | Look for wisdom everywhere. Difference between expert and expert beginner is self-reflection, realization and changing behavior. Self-awareness is the key.
31 |
32 | * http://www.theatlantic.com/health/archive/2013/10/how-to-build-a-happier-brain/280752/
33 | * http://www.newrepublic.com/article/118714/interruptions-work-make-you-way-less-productive
34 | * http://georgestocker.com/2014/04/15/how-to-destroy-programmer-productivity/
35 | * It's not about what you know, but rather your framework for adding more knowledge.
36 | * https://medium.com/@maebert/9-things-i-learned-as-a-software-engineer-c2c9f76c9266
37 | * http://www.sfu.ca/pamr/media-releases/2014/scientists-discover-brains-anti-distraction-system.html
38 | * http://well.blogs.nytimes.com/2014/03/10/do-brain-workouts-work-science-isnt-sure/?src=me&ref=general
39 | * http://ayearofproductivity.com/top-lessons-learned-a-year-of-productivity/
40 | * http://joshldavis.com/2014/06/13/put-yourself-out-there/
41 | * https://medium.com/@jakek/my-year-with-a-distraction-free-iphone-and-how-to-start-your-own-experiment-6ff74a0e7a50
42 | * http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0111081
--------------------------------------------------------------------------------
/330 - Cognition for Data Pros.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/330 - Cognition for Data Pros.png
--------------------------------------------------------------------------------
/330 - know all the things.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/330 - know all the things.jpg
--------------------------------------------------------------------------------
/350 - Example Math.xlsx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/350 - Example Math.xlsx
--------------------------------------------------------------------------------
/350 - Matrix Prioritizaton.md:
--------------------------------------------------------------------------------
1 | ## Matrix Prioritization
2 |
3 | * Consider both importance and trimmed outliers.
4 | * You want the most healthy *mix* of attributes, not the best in a particular area
--------------------------------------------------------------------------------
/360 - Life is an Optimization Problem.md:
--------------------------------------------------------------------------------
1 | ## Optimizing Life
2 |
3 | Life is an optimization problem
4 |
5 | * No single thing is good if you scale it up infinitely.
6 | * Very few things are good if you have nothing of them.
7 | * There are diminishing returns everywhere.
8 |
9 | * https://nplusonemag.com/issue-21/the-intellectual-situation/too-fast-too-furious/
10 | * http://www.kpcb.com/design/how-to-be-happy-in-business-by-bud-cadell
11 | * http://www.seattletimes.com/opinion/living-the-small-happy-life-surprisingly-important-to-many/
12 |
13 | ### Examples
14 |
15 | * Sleep
16 | * Food
17 | * Money
18 | * Housing
19 | * Exercise
20 | * Friends
21 | * Time spent with X
22 | * Programming, being productive
23 | * Playing (games, etc)
24 |
25 | Judge everything on a scale of -3 to 3. 0 means it neither adds nor detracts from your life.
26 |
27 | Then it's a linear algebra problem. What do optimal solutions look like?
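One way to sketch what an "optimal solution" might look like, under loudly stated assumptions: each activity's value is modeled as `weight * hours - penalty * hours**2`, a curve that rises, flattens, and then turns negative, which matches the diminishing-returns bullets above. The quadratic form, the weights, and the function name are illustrative assumptions, not something the notes specify. Because every curve is concave, greedily handing out one hour at a time to the best marginal gain gives an optimal whole-hour allocation.

```python
def allocate_hours(activities, budget):
    """Greedily allocate whole hours across activities.

    activities maps name -> (weight, penalty); the assumed value of h hours
    is weight * h - penalty * h * h.
    """
    hours = {name: 0 for name in activities}

    def marginal(name):
        # Gain from giving this activity one more hour.
        weight, penalty = activities[name]
        h = hours[name]
        return (weight * (h + 1) - penalty * (h + 1) ** 2) - (weight * h - penalty * h * h)

    for _ in range(budget):
        best = max(hours, key=marginal)
        if marginal(best) <= 0:
            break  # every extra hour would now make things worse
        hours[best] += 1
    return hours
```

For example, `allocate_hours({"sleep": (8, 0.5), "work": (6, 0.5)}, 24)` stops at 8 hours of sleep and 6 of work and leaves the rest of the budget unspent, because more of either would lower the total.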
--------------------------------------------------------------------------------
/370 - Industry Comparisons.md:
--------------------------------------------------------------------------------
1 | # Industry Comparisons
2 |
3 | ## Keep Goals In Mind
4 |
5 | * Schools - it's not a race
6 | * Sports - the winner vs. everyone else
7 | * Nurses - false-negatives are bad (you want to be alerted all the time)
8 | * Fighter pilots - false positives are bad (don't fire on airliners)
9 | * General issues
10 | * Financialization - http://www.nakedcapitalism.com/2014/06/wikileaks-exposes-super-secret-regulation-gutting-financial-services-pact.html
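The nurse and fighter-pilot examples above boil down to picking an alert threshold with different costs for each kind of mistake. This is a minimal sketch with made-up scores, labels, and costs; the function names are hypothetical.

```python
def confusion_counts(scores, labels, threshold):
    """Count (false positives, false negatives) when alerting on score >= threshold."""
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and not y)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y)
    return fp, fn


def cheapest_threshold(scores, labels, fp_cost, fn_cost, candidates):
    """Pick the candidate threshold with the lowest total error cost."""
    def cost(threshold):
        fp, fn = confusion_counts(scores, labels, threshold)
        return fp * fp_cost + fn * fn_cost

    return min(candidates, key=cost)


# Illustrative data only.
scores = [0.1, 0.4, 0.6, 0.9]
labels = [False, True, False, True]
candidates = [0.0, 0.5, 1.0]

nurse = cheapest_threshold(scores, labels, fp_cost=1, fn_cost=10, candidates=candidates)
pilot = cheapest_threshold(scores, labels, fp_cost=10, fn_cost=1, candidates=candidates)
```

With the same data, the "nurse" costs choose the lowest threshold (alert on everything) and the "pilot" costs choose the highest (never fire). The model didn't change; the goal did.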
11 |
12 | ## Government
13 |
14 | **Military spending**
15 | Contractor
16 | Cost after inflation
17 | Cost overruns
18 | Lifetime cost
19 | Number compared to previous generation
20 | Number of overseers
21 |
22 | ## Insurance
23 |
24 | • Insurance post is in \SQL Blog\Cheap Car
25 | * http://techcrunch.com/2014/06/21/will-google-enter-the-insurance-industry/
26 |
27 | • Don't treat it the same across years
28 | • Do a two-year trailing average
29 | • How to account for differing premiums charged? It's a huge confound
30 | ○ Different lesson? You get what you pay for?
31 | ○ How to project expected return? Premium X loss ratio?
32 | ○ I need to include personal examples from 3 different companies. Yuck
33 | • How to get # of rejected complaints?
34 | • Mutual insurance or not?
35 | • For profit or not?
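A minimal sketch of the two calculations floated in the list above: a two-year trailing average of the loss ratio, and "premium X loss ratio" as a crude expected payout. The numbers are invented and the function names are hypothetical; this does nothing about the premium confound mentioned above.

```python
def trailing_average(values, window=2):
    """Trailing average over the last `window` values; None until enough history exists."""
    averages = []
    for i in range(len(values)):
        if i + 1 < window:
            averages.append(None)
        else:
            chunk = values[i + 1 - window : i + 1]
            averages.append(sum(chunk) / window)
    return averages


def expected_payout(premium, loss_ratio):
    """Crude projection: the share of the premium paid back out as claims."""
    return premium * loss_ratio


# Made-up yearly loss ratios (claims paid / premiums collected) for one insurer.
loss_ratios = [0.5, 0.75, 0.25]
smoothed = trailing_average(loss_ratios, window=2)
```

The first year has no trailing average because there is not enough history yet, which is one reason not to treat every year the same.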
36 |
37 | http://www.insure.com/articles/interactivetools
38 |
39 | ## Loss Ratio
40 | @DevNambi Publicly traded companies release quarterly reports that include some of that info. cc @erinstellato
41 |
42 | ## Sharing Economy
43 |
44 | Why Portland is keeping Uber out of the Rose City - GeekWire
45 | Seattle City Council worries about gaps in ride-service insurance | Local N
46 |
47 | ### Government
48 |
49 | * Libraries - http://online.wsj.com/news/articles/SB20001424052702303996604580086191560891202?mg=reno64-wsj&url=http%3A%2F%2Fonline.wsj.com%2Farticle%2FSB20001424052702303996604580086191560891202.html
50 |
51 | ### Shared Cars
52 |
53 | http://mattstoller.tumblr.com/post/82233202309/ubers-algorithmic-monopoly-we-are-not-setting-the
54 |
55 | http://www.nytimes.com/2014/04/22/business/companies-built-on-sharing-balk-when-it-comes-to-regulators.html?_r=0
56 |
57 | http://www.wired.com/2014/04/trust-in-the-share-economy/
58 |
59 | http://www.theverge.com/2014/6/17/5816254/taskrabbit-blows-up-its-auction-house-to-offer-services-on-demand
60 |
61 | http://blogs.citypaper.com/index.php/the-news-hole/desperate-hustle-way-life/
62 |
63 | http://lefsetz.com/wordpress/index.php/archives/2014/07/05/kids-dont-care-cars/
64 |
65 | http://bits.blogs.nytimes.com/2014/08/28/uber-and-lyft-have-become-indistinguishable-commodities/
66 |
67 | http://www.wired.com/2014/10/volvo-turbo-engine-concept/?mbid=social_fb
68 |
69 |
70 | ## Automation
71 |
72 | ### Self-Driving Cars
73 |
74 | http://seattletimes.com/html/nationworld/2023106759_apxdriverlesscars.html
75 |
76 | ### Cooking Robots
77 |
78 |
79 |
80 | * Advertising: The price we pay for being a broke society
81 | * http://tiltthewindmill.com/breather-real-estate-and-the-innovators-dilemma/
82 | * http://money.cnn.com/2014/10/15/technology/security/malvertising/index.html?iid=HP_River
83 |
84 | http://www.theatlantic.com/politics/archive/2014/04/city-state-governments-privatization-contracting-backlash/361016/
85 |
86 | - Industries to disrupt
87 | • Law
88 | • Medicine
89 | § http://www.theatlantic.com/health/archive/2012/10/why-were-still-waiting-on-the-yelpification-of-health-care/263815/
90 | • Realtors
91 | • Any cottage industries
92 | • Education
93 | § http://oedb.org/open/
94 | § Email Daniel Strauss when I do
95 |
96 | http://arstechnica.com/science/2014/04/publishing-stings-find-predatory-journals-shoddy-peer-review/
97 |
98 | http://www.salon.com/2014/06/20/the_music_industry_is_still_screwed_why_spotify_amazon_and_itunes_cant_save_musical_artists/
--------------------------------------------------------------------------------
/380 - Trust.md:
--------------------------------------------------------------------------------
1 | # Trust
2 |
3 | "Vaccine-autism fraud another reminder that people with wildly generalized mistrust turn out to be the biggest suckers for crazy stuff."
4 |
5 |
6 | * Specific vs. generic
7 | * "Trust but verify"
8 | * What do trustworthy companies/people have in common?
--------------------------------------------------------------------------------
/400 - Software as a Craft.markdown:
--------------------------------------------------------------------------------
1 | # The Software Guild
2 |
3 |
4 | - Craft
5 | - Hardware as craft tools
6 | § Monitors, keyboards, mice are like power tools, drills for craftsmen
7 | § Advocate a hardware budget, people can buy their own
8 | § Hell for IT, but not bad
9 | § Same for furniture?
10 | - Blog post on guild laws
11 | - Blog post on the craft of software engineering
12 |
13 | * Post on office setup
14 | * http://www.wired.com/2014/11/ikea-bekant-desk/
15 |
16 |
17 | codinghorror: Here's where you can order the 2013 Software Craftsmanship Ca
--------------------------------------------------------------------------------
/400 - software.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/400 - software.jpg
--------------------------------------------------------------------------------
/410 - Scientific Method.markdown:
--------------------------------------------------------------------------------
1 | # Science and Data Professionals
2 |
3 | - Blog post - the scientific method vs how to troubleshoot
4 | • Confounds
5 | • Experimental design vs what happened before
6 | • Isolating a single variable
7 | • Correlation vs causation
8 | • When is correlation enough
9 |
10 | IT pros and scientists are similar
11 |
12 |
13 | http://arstechnica.com/science/2014/09/is-there-a-creativity-deficit-in-science/
--------------------------------------------------------------------------------
/420 - Inductive vs Deductive.markdown:
--------------------------------------------------------------------------------
1 | # Inductive vs. Deductive Learning
2 |
3 | - Inductive vs deductive learning and engineering
4 |
--------------------------------------------------------------------------------
/430 - Hiring.md:
--------------------------------------------------------------------------------
1 | # Hiring
2 |
3 | * branded employees
4 |
5 |
6 | ## Hiring
7 |
8 | * https://medium.com/@ChaseTheTruth/hire-the-wisest-not-the-smartest-68b8b640ab5e
9 |
10 | ## How to find good DBAs or developers
11 | This is an information problem. It's also a FUD problem.
12 |
13 | ## Getting started (a guide for students)
14 | - Guide for new STEM students
15 |
16 |
17 | * http://www.mintzberg.org/blog/mbas-as-ceos <- MBAs make things worse
18 |
19 | * Questions to ask
20 | * Where to go looking
21 | * https://zapier.com/blog/remote-office-photos/
22 | * http://www.groovehq.com/blog/being-a-remote-team
23 | * http://paddy.io/posts/recruiters/
24 | * http://www.nytimes.com/2015/05/31/opinion/sunday/guess-who-doesnt-fit-in-at-work.html
25 | * https://medium.com/@joethorntonPF/structured-vs-unstructured-interviews-e35adef75db8
26 | * https://www.brentozar.com/archive/2016/04/interview-dbas-dont-ask-questions-show-screenshots/
27 | * http://andytroutman.com/articles/2013/01/24/rockstar-programmers-are-not-assholes.html
28 | * http://www.wired.com/2014/02/smart-jerks-old-people-hard-things-company/
29 | * https://www.shrm.org/resourcesandtools/hr-topics/technology/pages/it-employers-would-pay-15-percent-more-for-top-talent.aspx
30 | * https://medium.com/latticehq/how-much-does-employee-turnover-really-cost-d61df5eed151
31 | * http://www.b-list.org/weblog/2015/oct/19/destroy-all-hiring-processes/
32 | * https://www.linkedin.com/today/post/article/20140527132535-50510-interviewing-engineers-is-a-team-sport
33 | * http://michaelochurch.wordpress.com/2014/02/06/if-you-stop-promoting-from-within-soon-you-cant/
34 | * http://blog.landing.jobs/why-hunting-for-unicorns-is-bullshit-and-how-to-hire-a-great-ux-designer/
35 | * http://www.huffingtonpost.com/susan-p-joyce/job-search-tips_b_4834361.html
36 | * http://blog.fogcreek.com/were-bad-at-interviewing-developers-and-how-to-fix-it-interview-with-kerri-miller/
37 | * http://firstround.com/article/Mine-Your-Network-for-Early-Stage-Hiring-Gold
38 | * https://www.nczonline.net/blog/2015/09/my-favorite-interview-question/
39 | * https://medium.com/@evnowandforever/f-you-i-quit-hiring-is-broken-bb8f3a48d324
40 | * http://blog.triplebyte.com/three-hundred-programming-interviews-in-thirty-days
41 | * https://medium.com/ride-tech-blog/open-sourcing-our-interviewing-preparation-guide-102021f81626
42 | * Psychology - because we're looking for good judgment
43 | * Functional literacy is disempowering.
44 | * We know how to use a tool, but not when/why. The 'when all you have is a hammer' syndrome
45 | * http://firstround.com/article/Heres-Why-Youre-Not-Hiring-the-Best-and-the-Brightest
46 | * http://www.codecademy.com/blog/142-why-building-a-data-science-team-is-deceptively-hard
47 | * http://rustyrazorblade.com/2014/09/21-ways-to-minimize-employee-retention/
48 | * http://marlagottschalk.wordpress.com/2014/10/03/losing-talent-go-ahead-tell-yourself-its-mutual/
49 | * http://blog.alinelerner.com/resumes-suck-heres-the-data/
50 | * http://weblog.raganwald.com/2006/06/my-favourite-interview-question.html
51 | * http://carlos.bueno.org/2014/06/refactoring.html
52 | * http://swizec.com/blog/dear-tech-companies-this-is-not-how-you-hire-engineers/swizec/6643
53 | * http://www.brendangregg.com/blog/2017-11-13/brilliant-jerks.html
54 |
55 | ### Ways to troll recruiters
56 |
57 | "Sure, I know just the person to talk to!" <- refer them to another recruiter with a fake resume.
58 |
59 | * http://imgur.com/a/ZpNzE
60 | * http://blog.42floors.com/striking-back-recruiter-spam/
61 | * http://qz.com/258066/this-is-why-you-dont-hire-good-developers/
62 | * http://radar.oreilly.com/2014/10/resume-driven-development.html
63 | * http://www.cringely.com/2014/09/28/enemy-hr/
64 |
65 | Secretary Puzzle
66 | Why I Love Being A Programmer in Louisville (or, Why I Won’t Relocate to Wo
67 | adamlaiacano: "If you're going to hire 30 people, you're going to interview
68 | Never Have the "What Would It Take to Keep You Here?" Conversation - Rand's
69 | How should I add a new developer to the team? | Ars Technica
70 | The Most Revealing Interview Question - Referly Blog
71 | What Company Culture IS and IS NOT - Rand's Blog
72 | Insider Secrets for Hiring Great People: Avoid the Big Mistakes | LinkedIn
73 | Elad Blog: Reference Check Candidates
74 | 7 Reasons I’ll Turn Down a Job After Interviewing With You
76 | How Stripe built one of Silicon Valley’s best engineering teams
--------------------------------------------------------------------------------
/431 - CV of Failures.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/431 - CV of Failures.pdf
--------------------------------------------------------------------------------
/431 - job searches as developer.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/431 - job searches as developer.png
--------------------------------------------------------------------------------
/431 - resume viz.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/431 - resume viz.png
--------------------------------------------------------------------------------
/431- interviewing honesty.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/431- interviewing honesty.jpg
--------------------------------------------------------------------------------
/431-decoding-job-descriptions.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/431-decoding-job-descriptions.jpg
--------------------------------------------------------------------------------
/432 - Bad Work Situations.md:
--------------------------------------------------------------------------------
1 | # Bad Work Situations
2 |
3 | http://robertehall.com/2014/03/disengagement-economy-robert-hall-huffington-post/
4 |
5 | What are coping mechanisms?
--------------------------------------------------------------------------------
/432 - fail.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/432 - fail.jpg
--------------------------------------------------------------------------------
/440 - Database Development.markdown:
--------------------------------------------------------------------------------
1 | # Database Development Series
2 |
3 | - Blog post - process of query tuning
4 | § Long running vs run many times
5 | § Eliminate blocking as a culprit
6 | § What's left
7 | § Crappy hardware
8 | § Query the optimizer can't do much with
9 | § Unreasonable demands (billions of rows)
10 | § Firehose analogy
11 | § Fire hydrant is too small (hardware)
12 | § Hose is too small (query)
13 | § Too much water (rowcount)
14 | § Factory analogy
15 | § Raw ingredients (hardware)
16 | § Factory (query opt)
17 | § Both
18 | § Get estimated and actual query plan
19 | § Check using plan explorer
20 | § Long running
21 | § Get actual query plan
22 | - Things that defeat the optimizer - case statements, functions, tables with data correlations, table variables
23 |
24 | - Things the optimizer loves - string literals, number-based joins, tables with random distributions, a sensible number of tables to join
25 |
26 | - Blog post on DB development best practices
27 | - Blog post on using EXCEPT or INTERSECT to make sure query refactors return the same data
28 | Best practices for ETL logging
29 | Include the spid
30 | Include @@rowcount
31 | Include sproc name
32 | Include dynamic SQL
33 | Include server name, login
34 |
35 |
36 | - Blog post idea - DB testing
37 | Regression testing
38 | Deployment validation
39 | Whitebox testing
40 | Black-box testing
41 | - ME: go to datamanipulation.net/sqlquerystress - load tool
42 |
43 | Blog post on release cadences
44 | The slowest wins. Everybody is forced to adjust for that.
45 |
--------------------------------------------------------------------------------
/450 - Engineering Constraints.markdown:
--------------------------------------------------------------------------------
1 | # Engineering with Constraints
2 |
3 | We live in a world of constraints, trade-offs, and complex decisions. It is human nature to use heuristics and previous experience to limit our own choices and make rapid decisions (LINK TO BOOK ABOUT THIS, IN MY ROOM).
4 |
5 | Data structures in real life. Analogies to help people learn
6 |
7 | ### Time X fn_Y(Complexity) X fn_X(People) X fn(Motivation) X fn(Overhead) = constant
8 |
9 | When building software, there are some inherent limits.
10 |
11 | Venn diagram between speed, features, code debt/cleanliness/bugs
12 |
13 | What is the relationship between tools, brains, complexity, opportunity, and judgment?
14 |
15 | Blog idea = data quality, requirements, brains - comm overhead = constant
16 |
17 | The implications are unsettling.
18 |
19 | Human factors in engineering
20 | • Nonlinear factors
21 | • Non-determinism
22 | • Error rates
23 |
24 | #### Approach 1: Use documentation to reduce time required
25 |
26 | The idea is noble: use documentation to reduce complexity. Unfortunately, it's also not well thought out. Documentation has inherent bias, and is usually out of date a few minutes after it's written. The only exception appears to be documentation that is automatically rebuilt from the code. After all, **The Code Is The Law** (LINK).
27 |
28 | #### Approach 2: Add more people to speed things up
29 |
30 | Anyone who has read the Mythical Man-Month (LINK) knows this limitation. People don't scale. The communication overhead grows rapidly and makes the work harder. This is especially true with unskilled or unmotivated people; it's often faster without them than with them.
31 |
32 | #### Approach 3: Make things simpler
33 |
34 | This is a great idea in general. Unfortunately, making things *too* simple runs into the opposite problem: it's hard to do anything without adding complexity.
35 |
36 | #### Approach 4: No process
37 |
38 | This is also a decent idea. However, it is dependent upon second-order effects of your engineering team. They need to be adaptable and self-critical. The gains are often increased speed as inefficient processes are removed.
39 |
40 | Adds overhead, process
41 | Get enough of the big picture, get the details, and GO
42 | Best way to go fast is to go slow, and pare down to the essentials
43 | Get better at working not by studying how to work, but by working and using reflective practice
44 | Formality is overrated.
45 |
46 | #### Work everybody harder.
47 |
48 | This may work in the short term. In the long term, it is less efficient: exhaustion, attrition, and morale problems creep up. There are also physiological limits; it's statistically unlikely that your engineering team can maintain 20-hour days or 100-hour weeks and stay mentally sharp.
49 |
50 | Also, it sends a terrible message. Any executive or manager who thinks their employees should feel grateful for a grueling job at a pittance doesn't understand human psychology. People aren't robots. Their reactions are non-linear and unpredictable, and for that I'm grateful.
51 |
52 | #### Flatten
53 |
54 | This is one of my favorite approaches, because it removes overhead. It is also motivating.
55 |
56 | #### Make an 'innovative' team inside an old beast
57 |
58 | Blog post on agile development inside a waterfall framework
59 | • Messaging
60 | • Fitting stuff inside a timeline
61 | Inevitable friction arises.
62 | It also causes resentment on both sides. A rockstar team will feel like they are 'propping up' all these crappy other groups. The other groups will feel marginalized.
63 |
64 | #### Use a 'framework' or 'layer' to encapsulate and extend.
65 |
66 | Most of the time you're not reducing complexity. You're just hiding it. You also end up with various interfaces and APIs that you have to maintain.
67 |
68 | ### Price vs. value is non-linear
69 |
70 |
71 |
72 | Ratio of code to features
73 |
74 | - Blog post idea - price vs value
75 | § It's not linear
76 | § It's exponential
77 |
78 | Time series illustrates this well
79 |
80 |
81 | ### Trust is non-linear
82 |
83 | - Tension between prototyping (agile, incomplete) and the degradation of trust
84 |
85 | Requirements & user expectations are where things break down. Can't be agile & get them in a waterfall fashion
86 |
87 |
88 |
89 | ### Prototypes and Engineering
90 |
91 | * Humility
92 | * Getting it right the first time
93 | * Private failures and public successes
94 | * Knowing the goal is important
95 | - Tension between prototyping (agile, incomplete) and the degradation of trust
96 | Blog post on adapting to new changes
97 | Safely
98 | Wisely
99 |
100 | Trial periods are good for this
101 |
102 | http://blog.hut8labs.com/speeding-up-your-eng-org-part-i.html
103 |
104 | ### Ideas vs Execution
105 |
106 | "No business plan survives contact with reality"
107 | "No architecture survives contact with hardware"
108 | - Formality is overrated
109 | Adds overhead, process
110 | Get enough of the big picture, get the details, and GO
111 | Best way to go fast is to go slow, and pare down to the essentials
112 | Get better at working not by studying how to work, but by working and using reflective practice
113 |
114 | http://ejohn.org/blog/write-code-every-day/
115 |
116 | http://scottberkun.com/2014/critique-dont-fuck-up-culture/
117 |
118 | http://users.ece.utexas.edu/~adnan/pike.html
119 |
120 |
121 |
122 | # WHERE TO ADD?
123 | http://highscalability.com/blog/2012/2/27/zen-and-the-art-of-scaling-a-koan-and-epigram-approach.html
124 | - How to improve processes between business, developers, and DBAs
125 |
126 | - Agile requires good working conditions. Why? Because the pace is so rapid that people are the domain knowledge. They're even more critical. That means turnover is more disruptive than in slower organizations.
127 | § Blog: people add process to compensate for individual failings. And to set expectations. Why not go for more competence instead? Isn't process defeatist?
128 |
--------------------------------------------------------------------------------
/460 - Reputation Systems and PageRank.markdown:
--------------------------------------------------------------------------------
1 | - Learn more, blog about PageRank algorithm
2 | • How can it be used elsewhere?
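As a starting point for the PageRank post, here is a minimal power-iteration sketch. It assumes the graph fits in a dict mapping each node to the nodes it links to, uses the commonly cited damping factor of 0.85, and spreads the rank of nodes with no outgoing links evenly across all nodes. The function name and the fixed iteration count are choices made for this example, not a production implementation.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank by power iteration.

    links maps node -> list of nodes it links to. Ranks sum to 1.
    """
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}

    for _ in range(iterations):
        # Rank held by nodes with no outgoing links is shared by everyone.
        dangling = sum(rank[node] for node in nodes if not links.get(node))
        new_rank = {node: (1.0 - damping) / n + damping * dangling / n for node in nodes}
        for source, targets in links.items():
            if not targets:
                continue
            share = damping * rank[source] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank
```

The "used elsewhere" question is mostly about what counts as a link: citations, endorsements, and referrals all produce a graph this same loop can score.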
3 |
4 | Reputation
5 |
6 |
7 | ### Next Few Years
8 | * Data Science over the next few years will be Darwinian.
9 | * Companies that can be data-driven will thrive.
10 | * Others will die
11 | * Data Science as a C-level position. Strategic decisions about data will be C-level decisions w/ a management chain
12 |
13 | ? Can you build data creativity as a muscle?
14 |
15 |
16 | ** http://blogs.wsj.com/moneybeat/2014/12/19/buffett-reminds-his-top-managers-reputation-is-everything/
--------------------------------------------------------------------------------
/470 - Amazon.md:
--------------------------------------------------------------------------------
1 | # The 'Efficiency' Dystopia
2 |
3 |
4 | I hear a lot about 'efficiency' and 'progress'. My question, inevitably, is: who benefits? What does it cost? Do the benefits outweigh the costs?
5 |
6 | One of the biggest names in 'efficiency' and 'customer focus' is [Amazon.com](http://www.amazon.com). However, their relentless 'customer service' comes at a horrific price for the warehouse workers they 'employ'.
7 |
8 | * [Salon.com](http://www.salon.com/2014/02/23/worse_than_wal_mart_amazons_sick_brutality_and_secret_history_of_ruthlessly_intimidating_workers/)
9 | * [The Guardian](http://www.theguardian.com/technology/2013/dec/01/week-amazon-insider-feature-treatment-employees-work)
10 | * [Re/Code](http://recode.net/2014/06/30/amazon-was-a-prison-says-former-worker/)
11 | * [Gawker](http://gawker.com/true-stories-of-life-as-an-amazon-worker-1002568208)
12 | * [McCall](http://www.mcall.com/business/mc-amazon-temporary-workers-unemployment-20121215-story.html#page=1)
13 | * [Mother Jones](http://www.motherjones.com/politics/2012/02/mac-mcclelland-free-online-shipping-warehouses-labor)
14 | * [Forbes](http://www.forbes.com/sites/eamonnfingleton/2013/11/25/amazon-com-is-accused-of-slave-driving-after-bbc-secretly-videotaped-warehouse-conditions/)
15 | * [Business Insider](http://www.businessinsider.com/brutal-conditions-in-amazons-warehouses-2013-8)
16 | * [The International Business Times](http://www.ibtimes.com/amazoncoms-workers-are-low-paid-overworked-unhappy-new-employee-model-internet-age-1514780)
17 |
18 |
19 | This isn't limited to the U.S. In Germany, which has a strong union tradition, workers (read: people) are protesting [because the working conditions are inhumane](http://seattletimes.com/html/specialreportspages/2024340124_amazongermanyxml.html). Clearly this is because Germans are well known to be good-for-nothing slackers.
20 |
21 | In the U.K., the BBC did an [undercover investigation](https://www.youtube.com/watch?v=CXWJ4GfQ22E) to show what working at one of the warehouses is like.
22 |
23 | It is painfully obvious that the people who work to deliver goods for Amazon aren't treated like people; they are treated like cogs, to be worn down and discarded, because *there are always more cogs*. Letting people work at a sane pace, giving them access to medical care, heat, or decent bathroom breaks would put a (minuscule) dent in the bottom line.
24 |
25 | This is a more profitable way to run a business. That's the goal, right?
--------------------------------------------------------------------------------
/480 - computing women.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/480 - computing women.jpg
--------------------------------------------------------------------------------
/480 - lego_gender.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/480 - lego_gender.jpg
--------------------------------------------------------------------------------
/480 - racism and bigotry.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/480 - racism and bigotry.jpg
--------------------------------------------------------------------------------
/480 - recruiting WIT.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/480 - recruiting WIT.jpg
--------------------------------------------------------------------------------
/480 - what happens we're out.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/480 - what happens we're out.png
--------------------------------------------------------------------------------
/480 - women_astronomer.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/480 - women_astronomer.jpg
--------------------------------------------------------------------------------
/480- perfectcrime.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/480- perfectcrime.png
--------------------------------------------------------------------------------
/490 - Chart of Cosmic Exploration.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/490 - Chart of Cosmic Exploration.jpg
--------------------------------------------------------------------------------
/490 - scientific method.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/490 - scientific method.jpg
--------------------------------------------------------------------------------
/490 - what would feynman.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/490 - what would feynman.png
--------------------------------------------------------------------------------
/500 - Intro to Caching and Core Algos.markdown:
--------------------------------------------------------------------------------
1 | # Caching
2 |
3 |
4 | • Caching and eviction policies
5 | § LRU (least recently used)
6 | § Time-based
7 | § LRU w/ priority (risk)
8 | • Ways to store data in cache for fast retrieval
9 | § Similar to Thomas Kejser's 'Grade of the Steel' blog posts
10 | § Use Powershell? C#? C++?
11 | • O(1) and O(log N)
12 |
13 | Go over core algorithms. Think about Peter Thiel's limitations of current software to rank them by importance.
14 |
15 | - Blog post - Core algorithms
16 | • Hashing and partitioning algorithms
17 | § How to deal with skew?
18 | • Compression algorithms
19 | § Common
20 | § State of the art
21 | § Leverage data patterns for better compression (columnar)
22 | • Caching and eviction policies
23 | § LRU (least recently used)
24 | § Time-based
25 | § LRU w/ priority (risk)
26 | • Ways to store data in cache for fast retrieval
27 | § Similar to Thomas Kejser's 'Grade of the Steel' blog posts
28 | § Use Powershell? C#? C++?
29 | • O(1) and O(log N)
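
As a starting point for the LRU eviction bullet, here is a minimal sketch in Python. The class name `LRUCache` and its methods are hypothetical, chosen for illustration; it leans on `collections.OrderedDict`, where both lookup and reordering are O(1) on average. It does not cover the time-based or priority variants listed above.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used key."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._items = OrderedDict()  # oldest entry first, newest last

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)  # a read counts as a use
        return self._items[key]

    def put(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the oldest entry

    def __contains__(self, key):
        return key in self._items

    def __len__(self):
        return len(self._items)
```

With a capacity of 2, putting `a` and `b`, reading `a`, then putting `c` should evict `b`, because `a` was touched more recently.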
30 |
31 | Vector computation - a panacea or not?
32 |
33 | * http://www.extremetech.com/extreme/188776-how-l1-and-l2-cpu-caches-work-and-why-theyre-an-essential-part-of-modern-chips
34 | * http://igoro.com/archive/gallery-of-processor-cache-effects/
35 | * http://www.damninteresting.com/on-the-origin-of-circuits/
36 | * http://www.reedbeta.com/blog/2015/01/12/data-oriented-hash-table/
37 |
38 | Columnar compression - pull out as a CSV, flip into rows using AWK, http://www.unix.com/shell-programming-and-scripting/211181-converting-rows-columns-csv-file.html , and try compressing that way.
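
A rough sketch of that experiment in Python rather than AWK: transpose the rows with `zip`, write both layouts as CSV, and compare the zlib-compressed sizes. The helper name `compressed_sizes` and the synthetic sample data are assumptions made for illustration. Whether the column-major layout actually wins depends on the data, so the script prints the two sizes instead of assuming a result.

```python
import csv
import io
import random
import zlib


def compressed_sizes(rows):
    """Return (row_major_bytes, column_major_bytes) after zlib compression."""

    def to_csv(table):
        buf = io.StringIO()
        csv.writer(buf).writerows(table)
        return buf.getvalue().encode("utf-8")

    columns = list(zip(*rows))  # transpose: each output line holds one column
    return len(zlib.compress(to_csv(rows))), len(zlib.compress(to_csv(columns)))


# Synthetic sample data (hypothetical): an id plus two low-cardinality columns.
random.seed(0)
sample = [
    (i, random.choice(["WA", "OR", "CA"]), random.choice(["active", "inactive"]))
    for i in range(5000)
]
row_size, col_size = compressed_sizes(sample)
print("row-major bytes:", row_size)
print("column-major bytes:", col_size)
```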
--------------------------------------------------------------------------------
/501 - Moore's Law.md:
--------------------------------------------------------------------------------
1 | # Moore's Law
2 |
3 | * http://www.extremetech.com/computing/178529-this-is-what-the-death-of-moores-law-looks-like-euv-paused-indefinitely-450mm-wafers-halted-and-no-path-beyond-14nm#
4 | * What are the implications?
5 | * http://fgiesen.wordpress.com/2014/07/07/cache-coherency/
--------------------------------------------------------------------------------
/502 - Self-Documenting Code.md:
--------------------------------------------------------------------------------
1 | # Self-Documenting Code
2 |
3 | * Is it a myth?
4 | * What does it look like?
5 | * Is it a spectrum?
--------------------------------------------------------------------------------
/510 - Analysis of Brilliant People.markdown:
--------------------------------------------------------------------------------
1 | - Analysis of brilliant people
2 |
3 | Look for common traits
4 | Learn from the best.
5 |
6 |
7 | ### Balance
8 |
9 | * Mastery of a skill comes by working for long periods of time.
10 | * Life happens while we make other plans. The unexpected is a fertile source of new ideas.
11 |
12 | These two statements are both true, and also contradictory. It's a struggle to find a good balance.
13 |
14 | Premise: actions speak louder than words. If we want to be better, we should learn from those people who made a big impression in their time.
15 |
16 | What general lessons do they have? What threads are there in common?
17 |
18 | The assumption is that they were more than just smart. They also had a process and lessons that helped translate that intelligence into results.
19 |
20 | "It is necessary for you to learn from others' mistakes. You will not live long enough to make them all yourself." - Hyman G. Rickover
21 |
22 | "The way to tell a great idea is that, when people hear it, they say, 'Gee, I could have thought of that.'" – Feynman, quoted by Townes
23 |
24 |
25 | * http://www.hanselman.com/blog/ScottHanselmansCompleteListOfProductivityTips.aspx
26 | * https://podio.com/site/creative-routines
27 | * http://www.moreintelligentlife.com/content/edward-carr/last-days-polymath
28 | * http://nautil.us/issue/18/genius/super_intelligent-humans-are-coming
29 | * http://nautil.us/issue/18/genius/if-you-think-youre-a-genius-youre-crazy
30 | * http://nautil.us/issue/19/illusions/the-loneliest-genius
31 | * http://seekingintellect.com/2014/12/17/practical-advice-from-leonardo-da-vinci-on-learning-and-honing-your-craft.html
32 | * http://ethanwiner.com/adultbeg.html
33 | * http://www.wired.com/2015/05/inside-ilm/
34 | * http://www.nytimes.com/2015/07/26/magazine/the-singular-mind-of-terry-tao.html
35 | * http://www.brainpickings.org/2015/01/29/music-brain-ted-ed/
36 | * http://nautil.us/blog/how-a-genius-is-different-from-a-really-smart-person
37 |
38 | Lessons learned from Genius
39 |
40 |
41 | Processing Rank:
42 | 1. Einstein
43 | 1. http://higherpayingskills.com/2011/12/how-einstein-got-smart-learning/
44 | 2. Thomas Jefferson
45 | 3. Ben Franklin
46 | 4. Napoleon
47 | 5. Leonardo da Vinci
48 | 6. Tesla
49 | 7. Stephen Hawking
50 | 8. Isaac Newton
51 | 9. Marie Curie
52 | 10. Alan Turing
53 | 11. Thomas Edison
54 | 12. Steve Jobs
55 |
56 |
57 | General savants
58 | Tesla
59 | Teddy Roosevelt
60 | Edison
61 | Napoleon
62 | Thomas Jefferson
63 | Peter the Great
64 | Leonardo da Vinci
65 | Leon Battista Alberti
66 | Aristotle
67 | Archimedes
68 | Omar Khayyam
69 | Frederick II (Frederick the Great)
70 | Albertus Magnus
71 | Ben Franklin
72 | Goethe
73 | Henri Poincaré
74 | Physics geniuses
75 | Einstein
76 | Newton
77 | Feynman
78 | Bohr
79 | Stephen Hawking
80 | Galileo Galilei
81 | Rene Descartes
82 | Pascal
83 | Other Manhattan Project folks
84 | Marie Curie
85 | Carl Sagan
86 | John von Neumann
87 | Computer geniuses
88 | Turing
89 | Coders At Work
90 | Steve Jobs
91 | Nathan Myhrvold
92 | Herbert A Simon
93 |
94 | Reading Material
95 | Sleep Habits - http://amolife.com/personality/great-people-sleep-less.html
96 |
97 | http://carymillsap.blogspot.com/2014/02/how-did-you-learn-so-much-stuff-about.html?m=1
98 |
99 | I admire the intersection of passion, ethics, and competence.
100 | Turn this into a venn diagram
101 | Passion alone - flailing around
102 | Ethics alone - ivory tower debates
103 | Competence alone - amoral burnout
104 | Passion and ethics, no competence - hippies
105 | Passion and competence, no ethics - CEOs
106 | Ethics and competence, no passion -
107 |
108 | What about how they grew up? What were their influences?
109 |
110 | http://ayearofproductivity.com/top-lessons-learned-a-year-of-productivity/
111 |
112 | http://www.newyorker.com/tech/elements/walking-helps-us-think
113 |
--------------------------------------------------------------------------------
/510 - Brilliant People.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/510 - Brilliant People.png
--------------------------------------------------------------------------------
/510 - Smart People Traits.xlsx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/510 - Smart People Traits.xlsx
--------------------------------------------------------------------------------
/520 - healthy foods.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/520 - healthy foods.jpg
--------------------------------------------------------------------------------
/520.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/520.jpg
--------------------------------------------------------------------------------
/530 - 10 commands of architecture.markdown:
--------------------------------------------------------------------------------
1 | # 10 Commandments of System Architecture
2 |
3 | - Blog post idea - 10 commandments of system architecture
4 | Each commandment comes from a truth of the world
5 | Truths of the world are a web
6 | No singletons
7 | Systems must scale
8 | Scale-out is cheaper & more efficient than scale-up
9 | Coupling is bad
10 | Development time is expensive
11 | Time is money
12 | Time is short
13 | Interfaces are good things
14 | Design must be easily refactored
15 | 80/20 rule
16 | Enlightened and lazy
17 | Time is money
18 | Systems have inertia
19 | Change is a constant
20 | • Blog post idea - how do we think about systems architecture?
21 | • Architect - needs to think at multiple levels of abstraction and jump between them
22 | § The easier the better
23 |
24 | - http://highscalability.com/blog/2012/2/27/zen-and-the-art-of-scaling-a-koan-and-epigram-approach.html
25 |
26 | ## System Architecture Commandments
27 | Each commandment comes from a truth of the world
28 | Truths of the world are a web
29 | No singletons
30 | Systems must scale
31 | Scale-out is cheaper & more efficient than scale-up
32 | Coupling is bad
33 | Development time is expensive
34 | Time is money
35 | Time is short
36 | Interfaces are good things
37 | Design must be easily refactored
38 | 80/20 rule
39 | Enlightened and lazy
40 | Time is money
41 | Systems have inertia
42 | Change is a constant
43 | • Blog post idea - how do we think about systems architecture?
44 | • Architect - needs to think at multiple levels of abstraction and jump between them
45 | § The easier the better
46 |
47 |
48 |
49 | * Software engineering
50 | * Trust, forgiveness, bad behavior & limits
51 | * Boundaries and interfaces
52 | * Abusive behavior
53 | * Conway's Law
54 | Therefore a failure to understand humans is a serious weakness
55 | - Human factors in engineering
56 | • Nonlinear factors
57 | • Non-determinism
58 | • Error rates
59 | • Efficiency
60 |
--------------------------------------------------------------------------------
/540 - Learning and Retention Methods.markdown:
--------------------------------------------------------------------------------
1 | ## Learning and Retention Methods
2 |
3 | Mental capacity (RAM)
4 | Computers (hard drive)
5 | Goals:
6 | * Latency
7 | * Accuracy
8 | * Depth
9 | * Breadth
10 | * Connection
11 |
12 | How people think is important.
13 |
14 | 1-back and 2-back test
15 |
16 | Software Engineers build from their mind. The same way that a construction worker benefits in all sorts of subtle ways from keeping in shape, software engineers benefit from keeping their brains active and healthy.
17 |
18 | Music
19 | Sleep/rest
20 | Meditation
21 | Drugs - like steroids.
--------------------------------------------------------------------------------
/550 - SQL on RDS.markdown:
--------------------------------------------------------------------------------
1 | # An Introduction to SQL Server on Amazon RDS
2 |
3 | > How to calculate IOPS on your current server
4 |
5 | EBS and its limitations
6 |
7 | Planning for failure. AWS forces architecture to a higher standard.
8 |
9 | ### Blog Post Planning
10 |
11 | * Replaces DBAs. More precisely, it moves them up the value chain, to more complicated operations roles or more development/business work.
12 | * Put AdventureWorks on the server
13 | * Come up with a mix of CRUD operations for AdventureWorks
14 | * Mix of procs and direct queries. That's normal.
15 | * Use SQLIOSim (or something else?) for load testing.
16 | * Do it from a different EC2 instance in the same region.
17 | * Test the network latency & bandwidth before doing so.
18 | * Run this on multiple different machines in different regions.
19 | * Look at Scalyr to see how many machines they needed for statistical significance (representative sample)
20 | * Compare to SQL Azure
21 | * Run from a different machine in same region
22 | * But first, measure bandwidth and latency
23 | * Use Powershell remoting - learn how first.
24 |
25 | **Number of Tests**
26 |
27 | * Per region, per instance size
28 | * Enough for statistical certainty. Say, 20 each
29 |
30 | **Price comparisons**
31 | * Price per month
32 | * Price per hour
33 | * 99th percentile for each query. Price vs query runtime (inverse) vs concurrency.
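
For the 99th-percentile comparison, a minimal sketch using the nearest-rank definition. The function name `percentile` is hypothetical, and nearest-rank is only one of several percentile definitions; interpolating methods give slightly different answers.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("need at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]
```

One consequence worth keeping in mind for the test plan: with only 20 runs per configuration, the nearest-rank 99th percentile is simply the slowest run, so the tail estimate rests on a single sample.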
34 |
--------------------------------------------------------------------------------
/560 - Balance.markdown:
--------------------------------------------------------------------------------
1 | - Extremes are silly
2 | • Things that are good become bad w/ too much of them
3 | • Balance is necessary
4 | • That's why a straw man is a stupid idea.
5 |
6 | http://www.theatlantic.com/health/archive/2014/06/the-dark-knight-of-the-souls/372766/
7 |
8 |
9 |
--------------------------------------------------------------------------------
/570 - Housing Using Data.md:
--------------------------------------------------------------------------------
1 | # Housing Using Data
2 |
3 | http://dealloc.me/2014/05/24/opendata-house-hunting/
4 | http://www.nytimes.com/2014/07/20/realestate/using-data-to-find-a-new-york-suburb-that-fits.html?_r=0
--------------------------------------------------------------------------------
/600 - Advanced ETL Approaches.markdown:
--------------------------------------------------------------------------------
1 | # Titles
2 |
3 | * You Don't Know ETL
4 | * Dr. ETL Meet Hyde
5 | * Dr. ETL Meet Data Hyde
6 | * Dr. Data Meet ETL Hyde
7 | * **Advanced ETL Using T-SQL**
8 | * ETL Alchemy
9 | * You Can't Handle the ETL
10 |
11 | # Summary
12 |
13 | One of the most common, complicated problems for data professionals is turning oddly-structured data into clean data. In this session we will look at practical, proven ways to solve complicated data-transformation problems using T-SQL.
14 |
15 |
16 | # Abstract
17 | One of the most common, complicated problems for data professionals is turning oddly-structured data into clean data. This problem is getting more and more common. Data is increasing in size and complexity, and its original format is never the most efficient one to analyze.
18 |
19 | In this session we will look at practical, proven ways to solve complicated data-transformation problems using T-SQL. Examples include denormalizing historical dimensions (Type-2), billing system ETL, the bill-of-materials problem, multithreading interdependent ETL processing, and advanced change detection methods. You'll learn general techniques to tackle any data-transformation problem in your ETL processing.
20 |
21 |
22 | @DevNambi Yes, please, you should still submit your session. Be clear about your main objectives, those will stand out.
23 |
24 |
25 |
26 |
27 | # Session Notes
28 |
29 | Versioning
30 | Fact versioning
31 | aggregation based on type-1 dimensions
32 |
33 | ETL framework
34 | type-2 denormalization
35 | iterator vs. CBL
36 | ETL parallelism
37 | digraph
38 | 'Real-time' ETL vs. not
39 | Comes with a warning.
40 | Retention trade-off
41 | No aggregation trade-off
42 |
43 | Type-1 denormalization
44 | Type-2 denormalization
45 | Change detection
46 |
47 | Relational for things it's not designed for
48 | Joint courses ETL
49 |
50 | Evaluate tools to see if they do this
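
One common way to approach the change detection item is to fingerprint the tracked columns of each row and compare fingerprints by key. The sketch below is in Python rather than T-SQL, and it is an assumption about what the session might cover, not the session's actual method. The names `row_hash` and `detect_changes` are hypothetical, and rows are modelled as plain dicts.

```python
import hashlib


def row_hash(row, columns):
    """Hash the tracked columns of a row into a fixed-size fingerprint."""
    # The unit separator keeps ("ab", "c") distinct from ("a", "bc").
    payload = "\x1f".join(repr(row[col]) for col in columns)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def detect_changes(existing, incoming, key, columns):
    """Split incoming rows into (inserts, updates, unchanged) relative to existing rows."""
    known = {row[key]: row_hash(row, columns) for row in existing}
    inserts, updates, unchanged = [], [], []
    for row in incoming:
        if row[key] not in known:
            inserts.append(row)
        elif known[row[key]] != row_hash(row, columns):
            updates.append(row)
        else:
            unchanged.append(row)
    return inserts, updates, unchanged
```

Deleted rows (present in `existing` but missing from `incoming`) are not handled here; that needs a second pass over the existing keys.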
51 |
52 |
53 | ## Tools
54 | Powershell
55 | T-SQL
56 | SSIS
57 | Pitch the SSIS extensions
58 | Hadoop
59 | Python
60 | C# - *not* a good idea
--------------------------------------------------------------------------------
/610 - ETL tips and Tricks.markdown:
--------------------------------------------------------------------------------
1 | # ETL Tips and Tricks
2 |
3 | Logging
4 | 80/20 rule for bottlenecks
5 | Parallelism
6 | Amdahl's law
7 | Load frequency wags the dog.
8 | ETL 'frameworks' & not-invented-here syndrome
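
To make the Amdahl's law bullet concrete: if a fraction p of a load can run in parallel across n workers, the best possible speedup is 1 / ((1 - p) + p / n). A small Python sketch, with `amdahl_speedup` as a hypothetical helper name:

```python
def amdahl_speedup(parallel_fraction, workers):
    """Upper bound on speedup when only part of the work can be parallelized."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# An ETL job that is 80% parallelizable, run on 8 workers.
print(round(amdahl_speedup(0.8, 8), 2))
```

For that 80% case the speedup can never exceed 1 / 0.2 = 5, no matter how many workers are added, which ties back to the 80/20 rule for bottlenecks: the serial part is where the time goes.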
9 |
10 | Difficulty = Data Volume X Load Frequency X Types of ETL / Talent^2
11 |
12 | ## What is best for Hadoop / Hive / Pig / Cascading / PoSH?
13 |
14 | ### Do a comparison-contrast
15 |
--------------------------------------------------------------------------------
/620 - Data Science Intro.markdown:
--------------------------------------------------------------------------------
1 | # Titles
2 | * Data Science: Field of Vision
3 | * Data Science: Beyond the Hype
4 | * Data Science: Beyond the Hype Cycle
5 | * Machine Learning for Mere Mortals
6 |
7 | # Summary
8 |
9 | Machine learning is a way to find meaning in data. This is a fun and gentle introduction to the world of machine learning. You'll learn to implement common techniques in T-SQL and solve everyday problems.
10 |
11 | # Abstract
12 |
13 | Machine learning is a hybrid of computer science and math. It's used everywhere: web search (Google, Bing), recommendation engines (Netflix, Amazon, LinkedIn), computational vision (self-driving cars), and natural language processing (Google Translate, Klout).
14 |
15 | The basics of machine learning are simple. You don't need to be a level 18 data scientist to use machine learning to solve problems.
16 |
17 | Join fellow data geek Dev Nambi in this fun and gentle introduction to the world of machine learning. We will cover common techniques such as clustering, supervised vs unsupervised learning, and learning at scale. Finally, you'll learn how to implement common machine learning techniques in T-SQL.
18 |
19 | ### Abandoned Abstracts
20 |
21 | 'Data Science' is all the rage these days. Most of the people I've spoken with are hesitant, probably because they weren't good at math.
22 |
23 | All of the techniques in data science are pretty intuitive once you see what they're about.
24 |
25 | This session will be a fast introduction to the world of data science.
26 |
27 | We'll look at the software side of things, including feature extraction and rapid prototyping.
28 |
29 | We'll look at the business side of things, including
30 |
31 | We'll also dive into its use, including clustering, recommender systems, natural language processing, and computer vision.
32 |
33 | Machine learning is the science of building predictive models from available data, in order to predict the behavior of new data.
34 |
35 | # Content
36 |
37 | ## Business
38 |
39 | ### Story-telling
40 |
41 | ###
42 |
43 | ## Math
44 |
45 | ### Machine Learning
46 |
47 | ### Statistics
48 |
49 | ## Engineering
50 |
51 | Development
52 | 'Big Data'
53 | Optimization
54 | It's all about scripting
55 | Quick and dirty is the point.
56 |
57 | ## Common Perspectives
58 |
59 | It's about the scientific method
60 | You don't know what's going to work ahead of time
61 | Experience makes you stop asking stupid questions. But you can get jaded
62 | Curiosity helps you ask stupid questions.
63 |
64 |
65 | ## Applications
66 |
67 | Natural language processing
68 | Searching for correlations
69 | Grouping together alike objects
70 | Pricing
71 | Behavior
72 | Identifying unexpected relationships
73 | System linkages (splunk)
74 | Purchasing behavior (retailers, Amazon)
75 | People (dating, LinkedIn, recruiter)
76 | Music (Pandora)
77 | Movies (Netflix)
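
"Grouping together alike objects" is what clustering does. The abstract promises T-SQL implementations; as a stand-in, here is a minimal k-means sketch (Lloyd's algorithm) in plain Python. The function name `kmeans` is hypothetical, and seeding the centroids with the first k points is a simplifying assumption; real implementations usually pick starting points more carefully.

```python
import math


def kmeans(points, k, iterations=20):
    """Lloyd's algorithm on tuples of numbers; returns (centroids, labels)."""
    centroids = [points[i] for i in range(k)]  # naive init: the first k points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster goes empty
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, labels
```

`math.dist` requires Python 3.8 or later. The returned labels come from the last assignment step.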
78 |
79 |
80 |
81 | Machine learning / data mining
82 |
--------------------------------------------------------------------------------
/621 - photo.JPG:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/621 - photo.JPG
--------------------------------------------------------------------------------
/640 - Making Data Friendly Organizations.markdown:
--------------------------------------------------------------------------------
1 |
2 | * Well-rounded data scientists are pretty rare.
3 | * Managers are thinking holistically (type 2)
4 | * Scientists are thinking more tactically (type 1)
5 |
6 | This was an exercise in feature engineering and exploratory data analysis
7 | * Barga wanted to learn something about the members of this class
8 | * Expected to see natural clusters of profiles
9 | * How would you measure similarity?
10 | * "We want to understand who our customers are, how they use our product."
11 | * You have to start w/ some features, initial criteria for calibrating
12 |
13 | What would I do to improve the process for a subsequent round?
14 | * A student learning this stuff has a different scale
15 | * How do we define expert and could we infer from other data?
16 | * What would it mean to standardize the scale?
17 | * What features would you add?
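
One possible answer to the similarity and scale questions above, sketched in Python under assumptions: each self-rated skill is treated as a numeric column, columns are rescaled to z-scores so that a student's 7 and an expert's 7 are compared relative to the group, and two profiles are compared with cosine similarity. The function names are hypothetical, and cosine similarity is only one of many reasonable measures.

```python
import math
import statistics


def standardize(column):
    """Rescale a list of numbers to mean 0 and standard deviation 1 (z-scores)."""
    mean = statistics.fmean(column)
    spread = statistics.pstdev(column)
    if spread == 0:
        return [0.0 for _ in column]  # a constant column carries no signal
    return [(x - mean) / spread for x in column]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```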
18 |
19 | *You will always screw up cohort clustering the first time*
20 |
21 | ### Next Few Years
22 | * Data Science over the next few years will be darwinistic.
23 | * Companies that can be data-driven will thrive.
24 | * Others will die
25 | * Data Science as a C-level position. Strategic decisions about data will be C-level decisions w/ a management chain
26 |
27 | ? Can you build data creativity as a muscle?
--------------------------------------------------------------------------------
/650 - Data To Decisions Education Abstract.html:
--------------------------------------------------------------------------------
1 | Colleges, Majors and Tuition - using data to make decisions
124 |
125 | Step 1: Ask questions
126 |
127 | Step 2: Look at data
128 |
129 | Step 3: Profit
130 |
131 | Data is growing faster than ever. Anyone who can use data to make decisions has a big advantage and is in high demand.
132 |
133 | Join fellow data geek Dev Nambi (@DevNambi) and learn how to answer thorny questions about picking a college, analyzing majors, and looking at tuition. We'll use clever questions, free data, and common tools like Excel, T-SQL and Powershell.
134 |
135 | You'll also learn general techniques to make sound data-based decisions for any problem.
136 |
--------------------------------------------------------------------------------
/650 - Data to Decisions Ed abstract.md:
--------------------------------------------------------------------------------
1 | ### Colleges, Majors and Tuition - using data to make decisions
2 |
3 | *Step 1: Ask questions*
4 |
5 | *Step 2: Look at data*
6 |
7 | *Step 3: Profit*
8 |
9 | Data is growing faster than ever. Anyone who can use data to make decisions has a big advantage and is in high demand.
10 |
11 | Join fellow data geek Dev Nambi (@DevNambi) and learn how to answer thorny questions about picking a college, analyzing majors, and looking at tuition. We'll use clever questions, free data, and common tools like Excel, T-SQL and Powershell.
12 |
13 | You'll also learn general techniques to make sound data-based decisions for any problem.
--------------------------------------------------------------------------------
/700 - autotrader_scrape.py:
--------------------------------------------------------------------------------
1 | from bs4 import BeautifulSoup
2 | from urllib.request import urlopen  # Python 3 replacement for urllib2
3 | from time import sleep # be nice
4 | import re
5 |
6 | BASE_URL = 'http://www.autotrader.com'
7 |
8 | def f7(seq): # de-duplication function that preserves order
9 |     seen = set()
10 |     seen_add = seen.add
11 |     return [x for x in seq if x not in seen and not seen_add(x)]
12 |
13 | def make_soup(url): # download a page and parse it with the lxml parser
14 |     return BeautifulSoup(urlopen(url).read(), "lxml")
15 |
16 | def get_links(url): # collect absolute links to vehicle detail pages
17 |     soup = make_soup(url)
18 |     links = [BASE_URL + link['href'] for link in soup.find_all('a', href=re.compile('vehicledetails'))]
19 |     return links
20 |
21 | def get_details(url): # scrape one listing's stats table into a dict
22 |     soup = make_soup(url)
23 |     table = soup.find('table', class_='vehicle-stats')
24 |     atid = table.find('td',text='AT Car ID:').next_sibling.get_text()[:11]
25 |     price = table.find('span', class_='primary-price').get_text()
26 |     mileage = table.find('td',text='Mileage').next_sibling.get_text()
27 |     body = table.find('td',text='Body Style').next_sibling.get_text()
28 |     color = table.find('td',text='Exterior Color').next_sibling.get_text()
29 |     drive = table.find('td',text='Drive Type').next_sibling.get_text()
30 |     fuel = table.find('td',text='Fuel Type').next_sibling.get_text()
31 |     doors = table.find('td',text='Doors').next_sibling.get_text()
32 |     return {"atid": atid,
33 |             "price": price,
34 |             "mileage": mileage,
35 |             "body": body,
36 |             "color": color,
37 |             "drive": drive,
38 |             "fuel": fuel,
39 |             "doors": doors}
40 |
41 | if __name__ == '__main__':
42 |
43 |     url = 'http://www.autotrader.com/cars-for-sale/searchresults.xhtml?zip=98103&Log=0&modelCode1=CTS&makeCode1=CAD&searchRadius=25&mmt=%5BCAD%5BCTS%5B%5D%5D%5B%5D%5D&showcaseListingId=353441599&showcaseOwnerId=100016026&captureSearch=true&showToolbar=true&Log=0'
44 |
45 |     links = get_links(url)
46 |     links = f7(links) # de-dupe
47 |     print(len(links))
48 |     for link in links:
49 |         data = get_details(link)
50 |         print(data)
51 |         sleep(1) # be nice
52 |
53 |
53 |
--------------------------------------------------------------------------------
/9900 - Cloud Uploads.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9900 - Cloud Uploads.jpg
--------------------------------------------------------------------------------
/9900 - Graphical Models.PDF:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9900 - Graphical Models.PDF
--------------------------------------------------------------------------------
/9900 - IT_roles.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9900 - IT_roles.jpg
--------------------------------------------------------------------------------
/9900 - commit linkbait.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9900 - commit linkbait.jpg
--------------------------------------------------------------------------------
/9900 - complexity kills.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9900 - complexity kills.jpg
--------------------------------------------------------------------------------
/9900 - complexity.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9900 - complexity.jpg
--------------------------------------------------------------------------------
/9900 - devops and security.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9900 - devops and security.jpg
--------------------------------------------------------------------------------
/9900 - enterprise-it.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9900 - enterprise-it.png
--------------------------------------------------------------------------------
/9900 - git undo flowchart.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9900 - git undo flowchart.png
--------------------------------------------------------------------------------
/9900 - ie-must-die.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9900 - ie-must-die.jpg
--------------------------------------------------------------------------------
/9900 - javascript.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9900 - javascript.png
--------------------------------------------------------------------------------
/9900 - linux perf tools.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9900 - linux perf tools.jpg
--------------------------------------------------------------------------------
/9900 - multithreading.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9900 - multithreading.jpg
--------------------------------------------------------------------------------
/9900 - programmer_style.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9900 - programmer_style.png
--------------------------------------------------------------------------------
/9900 - programming spec.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9900 - programming spec.jpg
--------------------------------------------------------------------------------
/9900 - reading software.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9900 - reading software.png
--------------------------------------------------------------------------------
/9900 - software-engineer.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9900 - software-engineer.jpg
--------------------------------------------------------------------------------
/9900 - stackoverflow.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9900 - stackoverflow.png
--------------------------------------------------------------------------------
/9900 - wicked problems.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9900 - wicked problems.jpg
--------------------------------------------------------------------------------
/9901 - GDP vs GNH.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9901 - GDP vs GNH.jpg
--------------------------------------------------------------------------------
/9901 - Smartphone Crossing.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9901 - Smartphone Crossing.jpg
--------------------------------------------------------------------------------
/9901 - learning stages.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9901 - learning stages.jpg
--------------------------------------------------------------------------------
/9901 - profanity motivation.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9901 - profanity motivation.jpg
--------------------------------------------------------------------------------
/9903 - CEO streamlining.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9903 - CEO streamlining.jpg
--------------------------------------------------------------------------------
/9903 - Robots and labor.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9903 - Robots and labor.jpg
--------------------------------------------------------------------------------
/9903 - counter-Varian Rule.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9903 - counter-Varian Rule.jpg
--------------------------------------------------------------------------------
/9903 - trickle down economics.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9903 - trickle down economics.jpg
--------------------------------------------------------------------------------
/9904 - 2016-12-9-gans.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9904 - 2016-12-9-gans.pdf
--------------------------------------------------------------------------------
/9904 - Big Data Deities.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9904 - Big Data Deities.png
--------------------------------------------------------------------------------
/9904 - Overfitting diagram.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9904 - Overfitting diagram.jpg
--------------------------------------------------------------------------------
/9904 - RoadToDataScientist1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9904 - RoadToDataScientist1.png
--------------------------------------------------------------------------------
/9904 - Scikit_Learn_Cheat_Sheet_Python.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9904 - Scikit_Learn_Cheat_Sheet_Python.pdf
--------------------------------------------------------------------------------
/9904 - data science funny.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9904 - data science funny.jpg
--------------------------------------------------------------------------------
/9904 - data science over time.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9904 - data science over time.png
--------------------------------------------------------------------------------
/9904 - data science skills venn.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9904 - data science skills venn.jpg
--------------------------------------------------------------------------------
/9904 - data viz.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9904 - data viz.png
--------------------------------------------------------------------------------
/9904 - data-science-venn-diagram.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9904 - data-science-venn-diagram.jpg
--------------------------------------------------------------------------------
/9904 - machine learning industry.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9904 - machine learning industry.png
--------------------------------------------------------------------------------
/9904 - ml libraries.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9904 - ml libraries.png
--------------------------------------------------------------------------------
/9904 - never use piece charts.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9904 - never use piece charts.jpg
--------------------------------------------------------------------------------
/9904 - stats-trick-question.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9904 - stats-trick-question.jpg
--------------------------------------------------------------------------------
/9904 - storytelling.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9904 - storytelling.jpg
--------------------------------------------------------------------------------
/9904 - tools.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9904 - tools.jpg
--------------------------------------------------------------------------------
/9904.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9904.jpg
--------------------------------------------------------------------------------
/9905 - Parenting Chores over Time.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9905 - Parenting Chores over Time.jpg
--------------------------------------------------------------------------------
/9905 - Parenting Iron Triangle.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9905 - Parenting Iron Triangle.jpg
--------------------------------------------------------------------------------
/9905 - money and time.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9905 - money and time.jpg
--------------------------------------------------------------------------------
/9905 - no hipsters.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9905 - no hipsters.jpg
--------------------------------------------------------------------------------
/9905 - why people become unhappy.jpeg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9905 - why people become unhappy.jpeg
--------------------------------------------------------------------------------
/9906 - Security Links.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9906 - Security Links.md
--------------------------------------------------------------------------------
/9906 - time to crack password.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9906 - time to crack password.png
--------------------------------------------------------------------------------
/9907 - cheese wheel.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9907 - cheese wheel.jpg
--------------------------------------------------------------------------------
/9907 - dentist prices.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9907 - dentist prices.pdf
--------------------------------------------------------------------------------
/9907 - growth of hospital admins.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9907 - growth of hospital admins.png
--------------------------------------------------------------------------------
/9907 - overweight.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9907 - overweight.jpg
--------------------------------------------------------------------------------
/9907 - recipe recommendation ML.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9907 - recipe recommendation ML.pdf
--------------------------------------------------------------------------------
/9907 - vaccines.gif:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9907 - vaccines.gif
--------------------------------------------------------------------------------
/9908 - Academia Misincentives.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9908 - Academia Misincentives.jpg
--------------------------------------------------------------------------------
/9908 - College and Career.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9908 - College and Career.jpg
--------------------------------------------------------------------------------
/9908 - NFL odds.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9908 - NFL odds.jpg
--------------------------------------------------------------------------------
/9908 - academic minions.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9908 - academic minions.jpg
--------------------------------------------------------------------------------
/9908 - education retention.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9908 - education retention.jpg
--------------------------------------------------------------------------------
/9908 - game of loans.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9908 - game of loans.jpg
--------------------------------------------------------------------------------
/9908 - goal of education.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9908 - goal of education.jpg
--------------------------------------------------------------------------------
/9908 - incentives.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9908 - incentives.jpg
--------------------------------------------------------------------------------
/9908 - teacher feedback funny.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9908 - teacher feedback funny.jpg
--------------------------------------------------------------------------------
/9908 - textbooks.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9908 - textbooks.png
--------------------------------------------------------------------------------
/9908.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9908.jpg
--------------------------------------------------------------------------------
/9909 - Education Reform Warnings.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9909 - Education Reform Warnings.pdf
--------------------------------------------------------------------------------
/9909 - Skunk Works Leadership.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9909 - Skunk Works Leadership.png
--------------------------------------------------------------------------------
/9909 - get out of the way.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9909 - get out of the way.jpg
--------------------------------------------------------------------------------
/9909 - org charts.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9909 - org charts.jpg
--------------------------------------------------------------------------------
/9909 - typical conversation with managers.webm:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9909 - typical conversation with managers.webm
--------------------------------------------------------------------------------
/9910 - mvp.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9910 - mvp.png
--------------------------------------------------------------------------------
/9910 - sick burn by new yorker.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9910 - sick burn by new yorker.jpg
--------------------------------------------------------------------------------
/9911 - CBP Task Group Out-brief Slides_FINAL.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9911 - CBP Task Group Out-brief Slides_FINAL.pdf
--------------------------------------------------------------------------------
/9911 - ComparisonOfVotingSystems.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9911 - ComparisonOfVotingSystems.png
--------------------------------------------------------------------------------
/9911 - Terrorism causes.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9911 - Terrorism causes.png
--------------------------------------------------------------------------------
/9911 - police and recording.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9911 - police and recording.jpg
--------------------------------------------------------------------------------
/9913 - Net Neutrality.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9913 - Net Neutrality.png
--------------------------------------------------------------------------------
/9913 - coca cola.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9913 - coca cola.png
--------------------------------------------------------------------------------
/9913 - misbehaving.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9913 - misbehaving.jpg
--------------------------------------------------------------------------------
/9914 - privacy vs security.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9914 - privacy vs security.jpg
--------------------------------------------------------------------------------
/9915 - dont shoot.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9915 - dont shoot.png
--------------------------------------------------------------------------------
/9916 - how to survive police encounters.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/9916 - how to survive police encounters.jpg
--------------------------------------------------------------------------------
/Archive/140 - Nonprofit_Revenue_-_Donation_Cannibalization.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/Archive/140 - Nonprofit_Revenue_-_Donation_Cannibalization.pdf
--------------------------------------------------------------------------------
/Archive/2014-05-08-keynote-one.md:
--------------------------------------------------------------------------------
1 | ---
2 | author: DevNambi
3 | date: 2014-05-08
4 | layout: post
5 | slug: pass-analytics-keynote
6 | title: PASS Business Analytics Thursday Keynote
7 | meta-description:
8 | - pass
9 | - keynote
10 | - microsoft
11 | - sqlpass
12 | - passbac
13 | ---
14 |
15 | The [PASS Business Analytics conference](http://www.sqlpass.org/bac/2014/Home.aspx) (ADD IMAGE) has started. I had the privilege of watching the [keynote](http://www.sqlpass.org/bac/2014/Sessions/Keynote.aspx) with 600-700 fellow data nerds.
16 |
17 | I was also playing [Data Science Bingo](https://github.com/tdhopper/Data-Science-Conference-Bingo).
18 |
19 | Good lord, too many pie charts.
20 |
21 | Everyone wants to get more out of their data.
22 |
23 | Of course, we started off with Tom LaRock (t/b).
24 |
25 | "We get paid to work with data every day" (Tom LaRock)
26 |
27 | ### Getting Involved
28 |
29 | We're a bunch of like-minded pros.
30 |
31 | Virtual Chapters
32 |
33 | SQL Saturdays
34 |
35 | There's a niche for you.
36 |
37 | Next year - April 20-22 in Santa Clara.
38 |
39 | ### "Big Data, Predictive Analytics, and the Middle Market"
40 |
41 | [John Whittaker](https://twitter.com/alertsource)
42 |
43 | Sr. Dir. IM from Dell Software
44 |
45 | Of 300 respondents, 96% said that they had big-data projects in flight or planned to launch them this year.
46 |
47 | Budgets range around 2-5 million. Within 2 years they'll be at the $6mil level.
48 |
49 | http://www.dell.com/learn/us/en/uscorp1/press-releases/2014-04-28-dell-software-big-data-midmarket-survey
50 |
51 | **What drives success?**
52 |
53 | * 41% - strong cooperation between business and IT
54 | * 37% - strong connection between data analytics and performance management
55 | * 33% - required skills (data science)
56 | * 32% - business requirements complete and accurate
57 | * 30% - server/storage capacity
58 | * 29% - capable datacenter tools
59 |
60 | Unknown -
61 |
62 | Very easy to be clever w/ predictive analytics. Just as easy to be creepy with it.
63 |
64 | **What are the most useful tools?**
65 |
66 | * 60% real-time processing
67 | * 58% predictive analytics
68 | * 56% data viz to convert processed data into actionable insights
69 | * 56% cloud computing for lower cost
70 |
71 |
72 | ME - get the keynote slides!
73 |
74 | **Main challenge**
75 |
76 | Complexity, volume, budget
77 |
78 | Data complexity (where the data lives, cleaning it, etc.) is still one of the big unsolved challenges today.
79 |
80 | 50% of organizations with big data projects in flight are satisfied with their decision making (speed, quality), compared to just 23% among those yet to kick off a project.
81 |
82 | Who should decide: split between IT and biz.
83 |
84 |
85 | ### Bingo Words
86 |
87 | * "Big Data"
88 | * "Complexity"
89 | * "Volume of data"
90 | * "Real time"
91 | * "Hadoop"
92 | * "NoSQL"
93 |
94 |
95 | Star Trek Redshirt Bayesian -
96 |
97 |
98 | # Amir Netz, Kamal Hathi
99 |
100 | Amir - chief designer for data platform
101 | Kamal (@kamalh) - director of engineering for BI
102 |
103 | We have an engineer as a MSFT CEO - is it soon enough?
104 |
105 | Talk about Microsoft culture.
106 |
107 |
108 | * Power Pivot - 2 million
109 | * Power Query - 100K
110 | * Power Map - 55K
111 | * (the three numbers above only count downloads)
112 | * HDInsight - 100 million compute hours
113 | * Power BI for Office 365 - 12.5K tenants activated (companies)
114 | * No. 1 market share gain in the 2013 Gartner BI market share report
115 |
116 | We've gone meta
117 |
118 | Tenants by date - up to around 12K tenants. Nice growth chart.
119 |
120 | 1,091K questions answered by Q&A last month.
121 |
122 | What kinds of features - most-used is auto-complete.
123 |
124 | "More tweets = more wins" for NBA finals. Not looking at population sentiment.
125 |
126 | iOS app for PowerBI will be available this summer. Native applications. No comments about Android yet, but it seems likely.
127 |
128 | Please speak about authentication for BI data on apps and cloud platforms. Active Directory, etc.
129 |
130 | SSRS running in PowerBI. Going to be available by the end of the summer. Natively integrated. They take care of all of the infrastructure. Connects to on-premises data sources. OK, that's pretty awesome.
131 |
132 | What about security?
133 |
134 | ### The Changing Face of BI
135 |
136 | First - all in IT. 6-7 years ago it wasn't working.
137 |
138 | 6-7 years ago: self-service, focused on analysts and power users. PowerPivot, etc. Self-service BI.
139 |
140 | Data Culture - give everybody the tools to satisfy their curiosity. This can be done in two ways: dumbing things down, elevating things up.
141 |
142 |
143 | ### Ways of Interacting with Data
144 |
145 | * Speed
146 | * Accuracy
147 | * Semantic meaning
148 |
149 | The approach is still dashboards and professionals, not products.
150 |
151 |
152 | "How to make data science so easy that anyone can do it?"
153 |
154 | Forecasting in Power View in Office 365 - simple forecasting.
155 |
156 | Going to be available in every line chart.
157 | Does seasonality and confidence interval.
158 | Confidence interval's based on standard deviation. Assume normal distribution.
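
A rough sketch of what "confidence interval based on standard deviation, assuming a normal distribution" can mean in practice. This is not Power View's actual algorithm, which the keynote did not describe. The function name `naive_forecast_interval`, the mean-based point forecast, and the sample numbers are all assumptions made for illustration.

```python
from statistics import NormalDist, mean, stdev


def naive_forecast_interval(history, confidence=0.95):
    """Return (point, low, high) for the next value.

    Assumption: the point forecast is just the historical mean, and the
    noise around it is normally distributed with the sample standard deviation.
    """
    if not 0 < confidence < 1:
        raise ValueError("confidence must be between 0 and 1")
    point = mean(history)
    spread = stdev(history)  # sample standard deviation, needs 2+ points
    # z-score for a two-sided interval, about 1.96 for 95%
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return point, point - z * spread, point + z * spread


# Hypothetical monthly sales figures
sales = [100, 104, 98, 102, 96]
point, low, high = naive_forecast_interval(sales)
print(f"forecast {point}, 95% interval {low:.1f} to {high:.1f}")
```

The interval widens as the history gets noisier. A real forecaster would also model trend and seasonality, which this sketch ignores.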
159 |
160 | ME - do this w/ Tableau forecasting, PowerBI for Notify data.
161 |
162 | Data exploration mode - will be available, no date specified.
163 |
164 | Time series analysis. Does time-series cross-validation to learn how. That's pretty awesome.
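
For reference, time-series cross-validation is usually done with a rolling origin: train on everything up to a point, test on what comes next, then move the cut-off forward. The sketch below is a generic illustration, not what Power BI does internally. The helper names and the naive "last value" model are assumptions.

```python
def rolling_origin_splits(n_points, min_train, horizon=1):
    """Yield (train_indices, test_indices) pairs that never train on the future."""
    for end in range(min_train, n_points - horizon + 1):
        yield list(range(end)), list(range(end, end + horizon))


def last_value_mae(series, min_train=3):
    """Mean absolute error of a naive model that predicts the last seen value."""
    errors = []
    for train_idx, test_idx in rolling_origin_splits(len(series), min_train):
        prediction = series[train_idx[-1]]  # naive "last value" forecast
        actual = series[test_idx[0]]
        errors.append(abs(actual - prediction))
    return sum(errors) / len(errors)


# Hypothetical series
series = [10, 12, 11, 15, 14]
splits = list(rolling_origin_splits(len(series), min_train=3))
error = last_value_mae(series)
```

Unlike ordinary k-fold cross-validation, the folds are never shuffled, so each test point comes strictly after its training data.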
165 |
166 | ME - show housing prices.
167 |
168 | Now there are treemaps.
169 |
170 |
171 | Talk about information retrieval.
172 |
173 |
174 |
175 |
176 |
--------------------------------------------------------------------------------
/Archive/2014-05-08-keynote-two.md:
--------------------------------------------------------------------------------
1 | ---
2 | author: DevNambi
3 | date: 2014-05-09
4 | layout: post
5 | slug: pass-analytics-mccandless
6 | title: PASS Business Analytics Friday Keynote
7 | meta-description:
8 | - pass
9 | - keynote
10 | - microsoft
11 | - sqlpass
12 | - passbac
13 | - visualization
14 | ---
15 |
16 | Today was the last day of the [PASS Business Analytics conference](http://www.sqlpass.org/bac/2014/Home.aspx) (ADD IMAGE).
17 |
18 | The keynote today was fun: David McCandless (blog, twitter).
19 |
20 |
21 | ### Denise
22 |
23 | * Amount of info is overwhelming.
24 | * How do you find the patterns that matter?
25 |
26 |
27 | ## David McCandless
28 |
29 | informationisbeautiful.net
30 |
31 | * What does billions look like.
32 | * Too large to really understand. Billion dollar-o-gram. Color are
33 | * The story is the connections between normally unconnected data.
34 | * Cost per taxpayer per day. It's a number and scale we can relate to.
35 | * Journalism tends to feed on fear. Yellow journalism.
36 | * Timeline of the world's biggest fears.
37 |
38 | $148 billion. Spent on obesity-related illness.
39 |
40 | Data is the new soil. Things grow from them.
41 |
42 | You need journalistic inquiry to deliver discovery and delivery.
43 |
44 | Break-up times: never on Xmas,
45 |
46 | Web scraping. David McCandless.
47 |
48 | PICTURE - most common
49 |
50 | Remember to put numbers in context.
51 |
52 | A million lines of code.
53 |
54 | Visual Drake equation
55 |
56 | PICTURE - visual resume.
57 |
58 | We learn different things by osmosis.
59 |
60 | Our brains interact in visual color, pattern and shape. It's the language of our brain. 75% of neurons (check that).
61 |
62 | Venn diagram - pigs, birds, humans - in-flu-venza. Wow, that's a terrible pun.
63 |
64 | **Twitter IPO**
65 |
66 | * 100 people
67 | * 20 are dead
68 | * 60 lazy - not in last week
69 | * 5 with more than 10- followers
70 | * 5 loud mouths - 32% are bots
71 | * 55 women, 45 men.
72 |
73 |
74 | 200 billion hours TV. 100 million hours to create Wikipedia.
75 |
76 | His data is all public. Use it.
77 |
78 | Food supplement by efficacy.
79 |
80 | **Fail Tips**
81 |
82 | * When you visualize a complex data set, you make a complex graphic. Doesn't solve it.
83 | * Circular diagrams aren't that usable.
84 | * Cartograms...hard to get the data out. Very hard to compare.
85 | * Design is really about removing things, cleaning down to a functional essence.
86 |
87 | **What Works**
88 |
89 | * Interestingness - goes after a useful question
90 | * Integrity - trustworthy
91 | * Form - has to look good, certain standard.
92 | * Function - easy to use
93 |
94 |
95 | * Data viz smackdown
96 | * Hard to write - so engrossing. Mark of a brilliant keynote.
97 |
98 | * All datasets are public on Google Docs. They spend a lot of time putting it together.
99 |
100 |
101 |
102 |
103 |
104 |
105 |
106 |
107 |
108 |
109 |
110 |
111 |
112 |
113 |
114 |
115 |
116 |
117 |
118 |
119 |
120 |
121 |
122 |
123 |
124 |
125 |
--------------------------------------------------------------------------------
/Archive/2014-05-13-passbac survey.xlsx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/Archive/2014-05-13-passbac survey.xlsx
--------------------------------------------------------------------------------
/Archive/2014-07-01-tsql-tuesday.md:
--------------------------------------------------------------------------------
1 | ---
2 | author: DevNambi
3 | date: 2014-07-01
4 | layout: post
5 | slug: tsql-tuesday-announcement
6 | title: T-SQL Tuesday - Assumptions
7 | meta-description:
8 | - tsql tuesday
9 | - sqlfamily
10 | - sqlpass
11 | ---
12 |
13 |
14 |
15 | It's hard to rock the boat.
It's hard to ask the basic questions that everybody knows.
It's hard to slow down and ask for clarification.
16 |
17 | So, we improvise. We guess, and our guesses become assumptions: things that are accepted as true, without proof. We often forget our assumptions, or make them instinctively.
18 |
19 | For this T-SQL Tuesday, the topic is **assumptions**.
20 |
21 |
22 | For example:
23 |
24 | * The sun will come up tomorrow.
25 | * Firewalls and anti-virus are enough to protect my computer.
26 | * My backups work even if I don't restore them.
27 | * I don't need to check for *that* error, it'll never happen.
28 |
29 |
30 |
31 | Your assignment for this month is to write about a big assumption you encounter at work, one that people are uncomfortable talking about. Every team has an [elephant in the room](http://en.wikipedia.org/wiki/Elephant_in_the_room).
32 |
33 |
34 | **What happens if these big guesses aren't true?**
35 |
36 |
37 | #### Housekeeping
38 |
39 | A few rules to follow when participating:
40 |
41 | * Your post must be published between **00:00 [PDT](http://www.timeanddate.com/library/abbreviations/timezones/na/pdt.html) Tuesday, July 8, 2014**, and **00:00 PDT Wednesday, July 9, 2014**.
42 | * Your post must contain the T-SQL Tuesday logo from above and the image should link back to this blog post.
43 | * Trackbacks won't work, so please tweet a link to me ([@DevNambi](https://twitter.com/DevNambi)) or send an email (me at devnambi dot com).
44 |
45 |
46 | Some optional (and highly encouraged) things to also do:
47 |
48 | * Include a reference to T-SQL Tuesday in the title of your post
49 | * Tweet about your post using the hash tag [#TSQL2sDay](https://twitter.com/search?q=%23tsql2sday)
50 | * Consider hosting T-SQL Tuesday yourself. Adam Machanic keeps the list.
51 |
52 |
53 | #### About T-SQL Tuesday
54 | T-SQL Tuesday was started by Adam Machanic ( [Blog](http://sqlblog.com/blogs/adam_machanic/) | [@AdamMachanic](https://twitter.com/AdamMachanic) ) in 2009. It’s a monthly blog party with a rotating host, who is responsible for providing a new topic each month. In case you've missed a month or two, Steve Jones ( [Blog](http://voiceofthedba.wordpress.com/2012/12/10/t-sql-tuesday-topics-december-2012-update/) | [@way0utwest](https://twitter.com/way0utwest) ) maintains a complete list for your reading enjoyment.
55 |
56 |
57 | *Happy sleuthing!*
--------------------------------------------------------------------------------
/Archive/NodeXL graphs.md:
--------------------------------------------------------------------------------
1 | # NodeXL
2 |
3 | * Crowds matter online. They're often larger than crowds in real life, but we understand them less. Inherently weak signal.
4 |
5 | ## Social Networks
6 |
7 | * Central tenet - social structure emerges from the aggregate of relationships among members of a population.
8 | * Emergence of cliques and clusters. Centrality (core) and periphery (isolates), betweenness.
9 | * Methods - surveys, interviews, etc.
10 | * Social media is all about networks.
11 |
12 | Patterns are left behind.
13 |
14 | There are many kinds of ties:
15 |
16 | * Send
17 | * mention
18 | * like link reply rate review favorite friend follow forward edit tag comment check-in.
19 | * one way relationships: lend money to.
20 | * bidirectional: is married to.
21 |
22 | Social media platforms are meaningfully different from each other. They all have one thing in common: networks.
23 |
24 | The UW doesn't have public squares anymore, with people who disagree with us. If it happens at all it happens online.
25 |
26 | A network is born whenever two entities are joined.
27 |
28 | Network theory: position, position, position. It's all relative.
29 |
30 | NodeXL - like social media for graphs.
31 |
32 | Trying to be the Firefox of GraphML.
33 |
34 | GraphML - XML for social networks (a data structure)
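As a rough illustration of what "XML for social networks" means, here is a minimal sketch that turns an edge list into GraphML using only Python's standard library. The `edges_to_graphml` helper and the sample names are hypothetical, not anything from the talk; the element names (`graphml`, `graph`, `node`, `edge`) and the `edgedefault` attribute are part of the GraphML format.

```python
import xml.etree.ElementTree as ET

GRAPHML_NS = "http://graphml.graphdrawing.org/xmlns"

def edges_to_graphml(edges, directed=True):
    """Serialize (source, target) pairs into a minimal GraphML string."""
    root = ET.Element("graphml", xmlns=GRAPHML_NS)
    graph = ET.SubElement(
        root, "graph", id="G",
        edgedefault="directed" if directed else "undirected",
    )
    # Declare each node once, in first-seen order.
    seen = set()
    for source, target in edges:
        for name in (source, target):
            if name not in seen:
                seen.add(name)
                ET.SubElement(graph, "node", id=name)
    # Then declare one edge element per tie.
    for source, target in edges:
        ET.SubElement(graph, "edge", source=source, target=target)
    return ET.tostring(root, encoding="unicode")

# Hypothetical ties: alice mentions bob, bob replies to carol.
graphml_text = edges_to_graphml([("alice", "bob"), ("bob", "carol")])
print(graphml_text)
```

A one-way tie (lend money to) maps to a directed graph; a bidirectional tie (is married to) would use `directed=False`.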
35 |
36 | Open Tools, Open Data, Open Scholarship.
37 |
38 | NodeXLGraphGallery.org - open data, user-generated collections/datasets.
39 | Open Scholarship - trying to make it easy.
40 |
41 | Try using the tool.
42 |
43 | ### 6 social network structures
44 |
45 | Divided or unified crowds
46 | Divided - political/controversial topic.
47 | United - some communities are unified.
48 |
49 | Fragmented - brand clusters
50 | they don't reply to each other.
51 | Clustered - community clusters
52 | they interact a bit.
53 | what happens when people grow up a bit.
54 | Hub-and-spoke - broadcast network
55 | PR/marketing.
56 | Institutional speaker.
57 | Called the 'audience' pattern - people who retweet don't interact with each other.
58 | Out-hub-and-spoke - support network
59 | Airline support.
60 | @DellCares
61 |
62 | The density of the connections is how
63 |
64 |
65 | ## Centrality
66 |
67 | * Eigenvector centrality.
68 | * PageRank
69 | * Betweenness centrality - influencers. The 'bridge' score.
70 | ME - look at this for side business.
71 |
72 | * Some connections are very important. Bridges. Only 2 points of connection. But they're the only thing that connects those two networks.
73 |
74 | When you are the bridge, you may charge a toll. It could be only social capital. It's hard because you connect to something that is not like you.
75 |
76 | Don't be a hub. Be a bridge.
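To make the 'bridge' score concrete, here is a small sketch of betweenness centrality for an unweighted, undirected graph. It uses Brandes' algorithm, which is one standard way to compute the score and not necessarily what NodeXL does internally. The six-person graph is made up for illustration: two tight triangles joined by a single tie between `c` and `d`.

```python
from collections import deque

def betweenness(adjacency):
    """Brandes' algorithm for an unweighted, undirected graph.

    adjacency maps each node to the set of its neighbours.
    Returns the unnormalized betweenness score of every node.
    """
    score = {node: 0.0 for node in adjacency}
    for start in adjacency:
        # Breadth-first search, counting shortest paths from start.
        order = []
        parents = {node: [] for node in adjacency}
        paths = {node: 0 for node in adjacency}
        dist = {node: -1 for node in adjacency}
        paths[start] = 1
        dist[start] = 0
        queue = deque([start])
        while queue:
            node = queue.popleft()
            order.append(node)
            for nxt in adjacency[node]:
                if dist[nxt] < 0:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
                if dist[nxt] == dist[node] + 1:
                    paths[nxt] += paths[node]
                    parents[nxt].append(node)
        # Walk back from the farthest nodes, accumulating dependencies.
        dependency = {node: 0.0 for node in adjacency}
        for node in reversed(order):
            for parent in parents[node]:
                share = paths[parent] / paths[node]
                dependency[parent] += share * (1 + dependency[node])
            if node != start:
                score[node] += dependency[node]
    # Each undirected pair was counted once from each end.
    return {node: value / 2 for node, value in score.items()}

# Hypothetical network: two triangles joined only by the c-d tie.
network = {
    "a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"},
    "d": {"c", "e", "f"}, "e": {"d", "f"}, "f": {"d", "e"},
}
scores = betweenness(network)
```

In this toy graph `c` and `d` have only one tie across the gap, yet every shortest path between the two triangles runs through them, so they get the highest scores while everyone else scores zero.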
77 |
78 | Isolates. It means there's never been an @____ in their tweets. It means they're the new members.
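A minimal sketch of how isolates could be pulled out of a set of tweets. It assumes, for illustration only, that the data is a list of `(author, text)` pairs and that an isolate is an author with no @-mention ties in either direction. The `find_isolates` helper, the regex, and the sample tweets are all hypothetical.

```python
import re

# Simplified pattern: an @ followed by word characters.
MENTION = re.compile(r"@(\w+)")

def find_isolates(tweets):
    """Return authors with no mention edges in either direction.

    tweets is a list of (author, text) pairs.
    """
    authors = set()
    connected = set()
    for author, text in tweets:
        author = author.lower()
        authors.add(author)
        for mentioned in MENTION.findall(text):
            mentioned = mentioned.lower()
            # Ignore self-mentions; they do not connect anyone.
            if mentioned != author:
                connected.add(author)
                connected.add(mentioned)
    return sorted(authors - connected)

sample_tweets = [
    ("alice", "Great session, @bob!"),
    ("bob", "Thanks @alice"),
    ("carol", "First time at this conference."),
]
isolates = find_isolates(sample_tweets)
print(isolates)
```

The resulting list is a starting point for the welcome-the-new-members idea: these are the people nobody has talked to yet.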
79 |
80 | IDEA FOR PASS: use social network analysis to identify influencers and new people to connect with.
81 |
82 | #CMgrChat - social media managers. Basically it's a small village.
83 |
84 | Look at the social network of people who are better at this than you. Find out, and then use this analysis to figure it out.
85 |
86 | ME - read more of stuff by Marc Smith, MSR researcher
87 |
88 | http://www.connectedaction.net/
89 |
90 | Last - plea for help.
91 |
92 |
93 | Because Excel is an ODBC source, anything that can join 2 tables can work in NodeXL.
94 |
95 |
96 |
97 |
98 |
99 |
--------------------------------------------------------------------------------
/Archive/uw - 010 - introduction.md:
--------------------------------------------------------------------------------
1 | # The Bootstrap
2 |
3 |
4 | This is the first post in a blog on data analysis, data-driven discovery, and decision making at the University of Washington.
5 |
6 | My name's Dev Nambi, and I'm a data scientist in the UW's [Enterprise Data and Analytics](http://www.washington.edu/uwit/im/EDA.html) team. (ADD PIC) I've worked at the UW since 2012. Before that I was a software developer and analyst at [Microsoft's Ads](http://advertising.microsoft.com/en/advertising-online) R&D group, an ETL developer at a startup, and [more](http://devnambi.com).
7 |
8 | *"The best minds of my generation are thinking about how to make people click ads..." - [Jeff Hammerbacher](https://twitter.com/hackingdata)*
9 |
10 | That was me. It was the best job I could find that paid enough to let me work off my student loans. Now I am hoping to give back to the next generation of students at the university.
11 |
12 |
13 | ### Data Science in Academia
14 |
15 | There is quite a bit of [excitement](https://news.cs.washington.edu/2013/11/12/uw-berkeley-nyu-collaborate-on-37-8m-data-science-initiative/) and [activity](http://escience.washington.edu/event/data-science-university-washington-campus-conversation) on data science in academia. So far the emphasis has been, rightly, on data-driven discovery in *scientific research*. The UW's emphasis there is its new [Data Science Incubator](http://data.uw.edu) and [eScience Institute](http://escience.washington.edu/).
16 |
17 | There are potentially far-reaching implications in fields as varied as astrophysics, oceanography, chemical engineering, genomics, and sociology.
18 |
19 | I admire that, but I want to do something more pragmatic, more *direct*.
20 |
21 |
22 | ### Data Science in Administration
23 |
24 | A university can be made more efficient using data. There are so many ways to do this it's mind-boggling, so I use a heuristic to pick areas to focus on: changes must directly benefit students.
25 |
26 | My reason for starting with students is simple: money.
27 |
28 | (ADD "the tuition is too damn high meme").
29 |
30 |
31 | Tuition is very expensive [compared to entry-level salaries](http://www.zerohedge.com/news/2014-05-18/net-worth-college-grads-student-debt-20-less-high-school-grads-no-debt), and that problem has been getting worse for *[decades](http://measuringup2008.highereducation.org/commentary/callan.php)*. Student debt is now [bigger than credit card debt](http://www.bizjournals.com/stlouis/blog/2013/04/fed-student-loan-debt-surpasses-auto.html), and it's [practically impossible to get rid of](http://www.studentloanborrowerassistance.org/bankruptcy/).
32 |
33 | My goal is simple: to find ways to help UW students graduate with the same quality education they have now, but with less debt.
34 |
--------------------------------------------------------------------------------
/Genome Science Blog Post.md:
--------------------------------------------------------------------------------
1 | # Genome Science Blog Post
2 |
3 | https://aws.amazon.com/blogs/big-data/interactive-analysis-of-genomic-datasets-using-amazon-athena/
--------------------------------------------------------------------------------
/List of things I still can't do in November 2014.md:
--------------------------------------------------------------------------------
1 | ### List of things I still can't do in November 2014
2 |
3 | -- from https://www.facebook.com/frkrueger/posts/10154867033210444
4 |
5 |
6 | **Legit**
7 |
8 | * Get a blood test simply, myself, without going to a lab, and get the results overnight.
9 | * Build a house myself using standard small interchangeable parts like legos.
10 | * Build a website with a resolvable domain on the internet, a great theme, and configurable data inside of 5 minutes.
11 | * Locate where my dog is right now using a barely noticeable GPS chip.
12 | * Share my health and fitness data with my doctor and my trainer in real time and get advice.
13 | * Have my car keep a web record of everywhere it has been, how it is doing, and what needs fixing or updating.
14 | * Find a list of all LA artists online, browse their work, and buy directly from them without going through an art dealer.
15 | * Vote online.
16 | * Get a doctor's house call using an app. (Uber for doctors).
17 | * Move $50,000 into a form of Bitcoin where the value is pegged to the USD, not a random number (the price of Bitcoin)
18 | * Take a picture of any object and find out where to buy it (Shazam for things)
19 | * Have all my accounting done as SaaS -- with 24/7 qualified accountants / planners / bill payers and a complete online record of my expenses and balance sheet available at all times.
20 | * Enter all my favorite classical music playlists, and have it coordinate with my travel schedule to update me who is playing where.
21 | * Be able to take a course online and get graded -- and get a diploma that means something. Education is really ripe for disruption: Coursera is OK, but we need to invent Stanford 2.0
22 | * Be able to sell my advice online. It's worth something and I should have some way to monetize it.
23 | * Be able to see a list of all single people in LA right now and efficiently sort through this data, with two way opt-in, to find an ideal match
24 | * See videos of restaurants before you go to them. Pictures can be deceiving. I want a video.
25 | * Be able to get a $100 MRI. It can be done for this price.
26 | * Be able to get into an ER for under $100. It's ridiculous that a mere 15 min consultation can cost somebody (the system) $1000+.
27 |
28 |
29 |
30 | **First World Problems**
31 |
32 | * Invest $10,000 in Uber in the secondary market by buying some shares from an existing shareholder -- with just a few clicks.
33 | * Find out which are the hot new restaurants in Paris from people who actually know.
34 | * Find a teacher of general relativity online. I tried.
35 | * Travel to space -- click, pay about 20 million, book a trip to the space station.
36 | * Send $10,000 to my caretaker in France and have her get the cash and withdraw it from an ATM that same day.
37 |
38 |
39 |
40 | **Ignores Paying a Living Wage**
41 |
42 | *a.k.a. can be done right now but at a high price*
43 |
44 | * Click on a recipe, pick number of people, and have all the ingredients delivered to me the next day by Amazon Fresh
45 | * Get a qualified guitar / piano / cello teacher to come to my house without randomly calling people on craigslist or doing general google searches.
46 | * Get somebody to come to my house, pick up my dry cleaning and drop it back off the next day (Amazon "Clean").
47 |
48 |
49 | **Ignores Context**
50 |
51 | *a.k.a. provides the wrong incentives, or can be badly abused*
52 |
53 | * Get a discount from the federal government for being healthy. Fat people should pay more taxes because they cost society more. This means some approved weigh in and testing centers
54 | * Get investors for my startup by advertising the stock offering on the web and selling shares directly.
55 | * Get a quick, binding divorce online.
56 | * Keep track of where everybody in my team is physically right now.
57 | * Call the police or the fire department or paramedics using an app.
58 |
59 |
--------------------------------------------------------------------------------
/Principles_of_Performance_Tuning.md:
--------------------------------------------------------------------------------
1 | # Principles of Performance Tuning
2 |
3 | * Less business complexity
4 | * Do less work
5 | * Less technical/design complexity
6 | * More efficient systems
7 | * More efficient workload (CPU cycles / unit of work)
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | blog-drafts
2 | ===========
3 |
4 | Drafts and ideas for my blog
5 |
--------------------------------------------------------------------------------
/company size and culture.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/company size and culture.png
--------------------------------------------------------------------------------
/crime-vs-incarceration.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/crime-vs-incarceration.jpg
--------------------------------------------------------------------------------
/darwin award.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/darwin award.jpg
--------------------------------------------------------------------------------
/data bias.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/data bias.jpg
--------------------------------------------------------------------------------
/einstein_ethics.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/einstein_ethics.jpg
--------------------------------------------------------------------------------
/equal-vs-fair.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/equal-vs-fair.png
--------------------------------------------------------------------------------
/math_for_grownups.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/math_for_grownups.jpg
--------------------------------------------------------------------------------
/mechanical_calculator.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/mechanical_calculator.jpg
--------------------------------------------------------------------------------
/precision-and-recall.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/precision-and-recall.jpg
--------------------------------------------------------------------------------
/resistance is just.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/resistance is just.jpg
--------------------------------------------------------------------------------
/student_debt.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/student_debt.jpg
--------------------------------------------------------------------------------
/wolf debt.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DevNambi/blog-drafts/dcb265a5d583c0b2f7243a12667d2895377f2631/wolf debt.png
--------------------------------------------------------------------------------