├── DataScienceEthics.md ├── EthicalAlgorithmTool.md ├── EthicalAlgorithmsSummary.pdf ├── LICENSE ├── README.md ├── data ├── 1.JPG ├── 10.JPG ├── 11.JPG ├── 12.JPG ├── 13.JPG ├── 14.JPG ├── 15.JPG ├── 16.JPG ├── 17.JPG ├── 18.JPG ├── 19.JPG ├── 2.JPG ├── 20.JPG ├── 21.JPG ├── 22.JPG ├── 23.JPG ├── 3.JPG ├── 4.JPG ├── 5.JPG ├── 6.JPG ├── 7.JPG ├── 8.JPG ├── 9.JPG └── disc_logo.JPG └── references.md /DataScienceEthics.md: -------------------------------------------------------------------------------- 1 | # Data Science Ethics 2 | 3 | ### Why are data science ethics important? 4 | Digital advances are producing huge amounts and new forms of data, allowing computers to process this data more quickly and make decisions without human oversight. This creates new opportunities and many new challenges we have not had to consider before. 5 | 6 | There are laws that set out important principles on how you can use data. Those working with data should be aware of these laws and always act within them. 7 | 8 | Public attitudes to data are changing. Working with data in a way that makes the public feel uneasy, without adequate transparency or engagement, could put your project at risk and also jeopardise other projects. Consideration of public attitudes and communication is key. 9 | 10 | 11 | ## Some Key Principles 12 | 13 | 1. **Start with clear user need and public benefit** - Data science offers huge opportunities to create evidence for policy making and also to make quicker and more accurate decisions. Being clear about the benefit you seek to achieve will help you justify the sensitivity of the data and the methods you want to use. Creating a use case means that you can articulate why better understanding will have benefits for individuals. Creating a use case will help you to: 14 | 15 | * Consider what risks it justifies and therefore what data and method you should use: risk to privacy, risk of making mistakes, and negative unintended consequences. 16 | * Start to think about what decisions might be taken as a result of the insight. 17 | 18 | 2. **Use data tools which have minimum intrusion necessary** - Use the minimum data necessary. Sometimes you will need to use sensitive personal data. You can take steps to safeguard people’s privacy, e.g. de-identifying or aggregating data to higher levels or using synthetic data. Some data science projects have direct and tangible benefits to individuals, and some will improve policymakers' understanding so that they can develop better policy. Ways to do this include: 19 | 20 | - Only use personal data if similar insight or statistical benefit cannot be achieved using non-personal data 21 | - De-identify individuals or aggregate to higher geographical levels where possible 22 | - Use synthetic data to get results 23 | - Query against datasets through APIs rather than having access to the whole data set 24 | 25 | You must take reasonable steps to ensure that individuals will not be identifiable when you link data or combine it with 26 | other data in the public domain. The increasing number of data sets available now or in the future means that it might be 27 | easier to link to other open data sources to infer an individual's identity or personal information about them. 28 |
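The safeguards listed above can be prototyped in a few lines. A minimal sketch using pandas, where the dataset, column names, salt and suppression threshold are all hypothetical and for illustration only:

```python
import hashlib

import pandas as pd

# Hypothetical person-level records standing in for a sensitive dataset.
records = pd.DataFrame({
    "person_id": ["A123", "B456", "C789", "D012"],
    "postcode": ["AB1 2CD", "AB1 2CD", "EF3 4GH", "EF3 4GH"],
    "outcome": [1, 0, 1, 1],
})

# De-identify: replace the direct identifier with a salted hash (pseudonymisation).
SALT = "project-specific-secret"  # keep the salt out of version control
records["person_id"] = records["person_id"].apply(
    lambda pid: hashlib.sha256((SALT + pid).encode()).hexdigest()[:12]
)

# Aggregate to a higher geographical level (postcode district rather than full postcode).
records["district"] = records["postcode"].str.split().str[0]
summary = (
    records.groupby("district")["outcome"]
    .agg(count="count", positive_rate="mean")
    .reset_index()
)

# Suppress small cells that could become identifiable when linked with other open data.
MIN_CELL_SIZE = 10
summary.loc[summary["count"] < MIN_CELL_SIZE, "positive_rate"] = None

print(summary)
```

Releasing only the aggregated, suppressed `summary` (rather than `records`) is one way of keeping intrusion to the minimum that the use case justifies.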
29 | 3. **Create robust data science models** - Good machine learning models can analyse far larger amounts of data far more quickly and accurately than traditional methods. We should think through the quality and representativeness of the data, flag if algorithms are using protected characteristics (e.g. ethnicity) to make decisions, and think through unintended consequences. Complex decisions may well need the wider knowledge of policy or operational experts. Algorithms learn from large amounts of historical data to make decisions. However, the quality of this data can affect algorithms and reinforce bias. Use techniques to spot the bias and then take corrective action to remove it. 30 | 31 | 4. **Be alert to public perceptions** - The law tells us what we can do but ethics tell us what we should do. Ethics become more important when advances in technology are pushing our understanding of the law to its limits. The law and ethical practice enable us to understand public opinion so we can work out what we should do. It is vital to understand both stated and revealed public opinion about how people would actually want the data you hold about them to be used. Consult with others to work out if projects are acceptable. 32 | 33 | 5. **Be as open and accountable as possible** - Being open allows us to talk about the benefit of data science. Let people know about the social benefit of your work and the impact it has had on collective or individual social or financial outcomes. Aim to be open about the tools, data and algorithms (unless doing so would jeopardise the aim, e.g. fraud detection). Make sure there is oversight and accountability throughout the project. 34 | 35 | 6. **Keep data secure** - The public are justifiably concerned about their data being lost or stolen, and you have a responsibility to protect both personal and non-personal classified data. It's vital that we keep it secure. The law (e.g. the [Data Protection Act](https://ico.org.uk/for-organisations/guide-to-data-protection/principle-5-retention/)) provides the basis for how data should be collected, shared, processed and deleted. 36 | 37 | -------------------------------------------------------------------------------- /EthicalAlgorithmTool.md: -------------------------------------------------------------------------------- 1 | # [Center for Democracy & Technology](https://cdt.org/) 2 | 3 | ## So You Want to Build an Ethical Algorithm... 4 | 5 | ℹ️ Interactive version: [cdt.info/ddtool](https://cdt.info/ddtool/) 6 | 7 | ### Design 8 | 9 | #### Concept 10 | As you design the goal and the purpose of your machine learning product, you must first ask: **Who is your audience?** 11 | 12 | * Is your product or analysis meant to include all people? 13 | * And, if not: is it targeted to an exclusive audience? 14 | * Is there a person on your team tasked specifically with identifying and resolving bias and discrimination issues? 15 | 16 | ![](data/1.JPG) 17 | 18 | #### Scope 19 | After you have defined the concept of your machine learning problem, you need to **identify possible outcomes and define criteria that contribute to them**. 20 | 21 | * How transparent will you be about the relationships between the inputs and the anticipated outcomes? 22 | * Are there sensitive characteristics you need to monitor in your data in order to observe their effect on your outputs? 23 | 24 | ![](data/2.JPG) 25 | 26 | Then you will need to **define expected results**. 27 | 28 | * Could your expectations rely on unacknowledged bias of you and your team? 29 | 30 | _One way to check_: have you asked a **diverse** audience if your expectations make sense to them?
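Alongside asking a diverse audience, a quick data check can make those expectations concrete: before modelling, summarise the base rate of the anticipated outcome across the sensitive characteristics identified during the Scope step. A minimal sketch with pandas, using hypothetical data and column names:

```python
import pandas as pd

# Hypothetical training data: `hired` is the anticipated outcome; the other
# columns are sensitive characteristics flagged during scoping.
df = pd.DataFrame({
    "gender": ["F", "M", "F", "M", "F", "M"],
    "age_band": ["<30", "<30", "30-50", "30-50", "50+", "50+"],
    "hired": [0, 1, 1, 1, 0, 1],
})

for col in ["gender", "age_band"]:
    # Base rate of the expected outcome within each group.
    print(f"\nOutcome rate by {col}:")
    print(df.groupby(col)["hired"].mean())
    # Large gaps here are worth discussing with the team (and with that
    # diverse audience) before they are locked in as "expected results".
```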
31 | 32 | ![](data/3.JPG) 33 | 34 | #### Data 35 | Once the concept and scope have been defined, it is time to focus on the acquisition, evaluation, and cleaning of data. 36 | 37 | ##### Acquire Data (buy, collect, generate) 38 | 39 | * Did the data come from a system prone to human error? 40 | * Is the data current? 41 | * What technology facilitated the collection of the data? 42 | * Was participation of the data subjects voluntary? 43 | * Does the context of the collection match the context of your use? 44 | * Was your data collected by people or a system that was operating with quotas or a particular incentive structure? 45 | 46 | ![](data/4.JPG) 47 | 48 | ##### Evaluate & Describe Data 49 | 50 | * Who is represented in the data? 51 | * Who is under-represented or absent from your data? 52 | * Can you find additional data, or use statistical methods, to make your data more inclusive? 53 | * Was the data collected in an environment where data subjects had meaningful choices? 54 | * How does the data reflect the perspective of the institution that collected it? 55 | * Were fields within the data inferred or appended beyond what was clear to the data subject? 56 | * Would this use of the data surprise the data subjects? 57 | 58 | ![](data/5.JPG) 59 | 60 | ##### Clean Data 61 | 62 | * Are there any fields that should be eliminated from your data? 63 | * Can you use anonymization or pseudonymization techniques to avoid needless evaluation or processing of individual data? 64 | 65 | ![](data/6.JPG) 66 | 67 | #### Constrain 68 | 69 | ##### Establish logic for variables 70 | 71 | * Can you describe the logic that connects the variables to the output of your equation? 72 | * Do your variables have a causal relationship to the results they predict? 73 | * How did you determine what weight to give each variable? 74 | 75 | ![](data/7.JPG) 76 | 77 | ##### Identify assumptions 78 | 79 | * Will your variables apply equally across race, gender, age, disability, ethnicity, socioeconomic status, education, etc.? 80 | * What are you assuming about the kinds of people in your data set? 81 | * Would you be comfortable explaining your assumptions to the public? 82 | * What assumptions are you relying on to determine the relevant variables and their weights? 83 | 84 | ![](data/8.JPG) 85 | 86 | ##### Define success 87 | 88 | * What amount and type of error do you expect? 89 | * How will you ensure your system is behaving the way you intend? How reliable is it? 90 | 91 | ![](data/9.JPG) 92 | 93 | ### Build 94 | 95 | #### Data Process 96 | How will you choose your analytical method? For example, predictive analytics, machine learning (supervised, unsupervised), neural networks or deep learning, etc. 97 | 98 | * How much transparency does this method allow your end users and yourself? 99 | * Are non-deterministic outcomes acceptable given your legal or ethical obligations around transparency and explainability? 100 | * Does your choice of analytical method allow you to sufficiently explain your results? 101 | * What particular tasks are associated with the type of analytical method you are using? 102 | 103 | ![](data/10.JPG) 104 | 105 | #### Tools 106 | You have two options. The first: will you need to **choose tools from available libraries**? 107 | 108 | * How could results that look successful still contain bias? 109 | * Is there a trustworthy or audited source for the tools you need? 110 | * Have the tools you are using been associated with biased products? 
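One way results that look successful can still contain bias is that an off-the-shelf model may post a strong overall score while performing much worse for an under-represented group. A small group-wise audit sketch with scikit-learn, run on synthetic data purely for illustration:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

# Synthetic data: group 1 is under-represented and follows a different pattern.
n_majority, n_minority = 900, 100
X = rng.normal(size=(n_majority + n_minority, 3))
group = np.array([0] * n_majority + [1] * n_minority)
y = (X[:, 0] > 0).astype(int)
y[group == 1] = (X[group == 1, 1] > 0).astype(int)  # different relationship for group 1

model = LogisticRegression().fit(X, y)
pred = model.predict(X)

# The headline number looks fine; the per-group numbers tell a different story.
print("Overall accuracy:", accuracy_score(y, pred))
for g in (0, 1):
    mask = group == g
    print(f"Accuracy for group {g}:", accuracy_score(y[mask], pred[mask]))
```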
111 | ![](data/11.JPG) 112 | 113 | Or will you **build new tools from scratch**? 114 | 115 | * Can you or a third-party test your tools for any features that can result in biased or unfair outcomes? 116 | 117 | ![](data/12.JPG) 118 | 119 | #### Feedback Mechanism 120 | 121 | ##### Internal 122 | 123 | * Have you built in a mechanism to track anomalous results so that they can be analyzed? 124 | * Is there a person on your team tasked with identifying the technical cause of biased outcomes? 125 | 126 | ![](data/13.JPG) 127 | 128 | ##### External 129 | 130 | * Is there a way for users to report problematic results including potentially discriminatory treatment? 131 | 132 | ![](data/14.JPG) 133 | 134 | ### Test 135 | 136 | #### Run Model 137 | After you audit the results, you must **test the model on preliminary data set(s)**. 138 | 139 | * Can you test your product using a data set that is representative of (or over-samples) a diverse population based on race, socioeconomic status, gender, sexual orientation, ethnicity, age, disability, education, etc.? 140 | 141 | ![](data/18.JPG) 142 | 143 | #### Audit Results 144 | Then you will need to **determine if your model performed in the way that you designed**. 145 | 146 | * Is a person accountable for addressing biased results and resolving issues with errors? 147 | * Did your feedback mechanism capture and report anomalous results in a way that allows you to check for biased outcomes? 148 | * What about the results is consistent with your expectations? Where do the results deviate? 149 | 150 | ![](data/22.JPG) 151 | 152 | #### Identify Errors 153 | 154 | ##### Examine the distribution of errors 155 | 156 | * Are errors evenly distributed across all demographics? 157 | * Is the type of error the same for different populations? (i.e., false positives vs. false negatives) 158 | 159 | ![](data/16.JPG) 160 | 161 | ##### Evaluate effect of error 162 | 163 | * What is the impact on an individual of a false positive? 164 | * What is the impact on an individual of a false negative? 165 | 166 | ![](data/17.JPG) 167 | 168 | #### Re-evaluate Variables and Data 169 | You will need to **address errors by changing your variables and/or data**. 170 | 171 | * Process any new data and variables with the same inquiry as the original model. 172 | * What factors are predominant in determining outcomes? 173 | * Are unintended factors or variables correlated with sensitive characteristics? 174 | 175 | ![](data/15.JPG) 176 | 177 | ### Implement 178 | 179 | #### Contextualize Results 180 | 181 | ##### Consider quality of results 182 | 183 | * What degree of confidence do you have in the results given the limitations of the data, etc.? 184 | * Are these results a good basis on which to make a decision? 185 | 186 | ![](data/19.JPG) 187 | 188 | ##### Interpret Results 189 | 190 | * Does your product work equally well for all types of people? 191 | * Can you determine metrics that demonstrate the reliability of your model (the degree to which it performs as expected)? 192 | * Can you inform future users of the data, current data subjects, and other audiences of your product about its weaknesses and remaining biases? 193 | 194 | ![](data/23.JPG) 195 | 196 | #### Monitor Results 197 | 198 | ##### Critically examine results for disparate impacts. 199 | 200 | * Where could bias have come into this analysis? 201 | * Is there a way for users to appeal a decision? 202 | * Is there a way for data subjects to report that they may have been treated unfairly?
203 | * Do you discard the data? If not, how do you keep it secure? 204 | 205 | ![](data/20.JPG) -------------------------------------------------------------------------------- /EthicalAlgorithmsSummary.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/numfocus/algorithm-ethics/bd4fffe7a7c377d0842ecf91e14930d712955fb0/EthicalAlgorithmsSummary.pdf -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2017 NumFOCUS 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Guidelines for Ethical Modeling 2 | 3 | ![](data/disc_logo.JPG) 4 | 5 | ### Contents 6 | 7 | This repository contains a collection of materials designed to assist in the creation of ethical algorithms and machine learning models. Initial work was done as part of the [Diversity and Inclusion Unconference](https://pydata.org/nyc2017/diversity-inclusion/disc-unconference-2017/) at [PyData NYC 2017](https://pydata.org/nyc2017/). 
8 | 9 | * [Ethical Machine Learning Tool](EthicalAlgorithmTool.md) - *How can you create ethical machine learning projects?* 10 | * **[Open the tool in your browser](https://cdt.info/ddtool/)** 11 | * [Data Science Ethics](DataScienceEthics.md) - *Why is ethical data science important?* 12 | * [Further Reading](references.md) - *Supporting materials and resources.* 13 | 14 | #### Contributing Authors 15 | 16 | * [Paige Bailey](http://www.github.com/dynamicwebpaige) (@DynamicWebPaige) 17 | * [Abhipsa Behera](http://www.github.com/abhipsa92) (@abhipsa92) 18 | * [Bernie Boscoe](http://www.github.com/bboscoe) (@bboscoe) 19 | * [Bobby Jackson](http://www.github.com/rcjackson) (@rcjackson) 20 | * [Mwai Karimi](http://www.github.com/kmwai) (@kmwai) 21 | * [Zairah Mustahsan](http://www.github.com/zairah10) (@zairah10) 22 | * [Disha Umarwani](http://www.github.com/dishaumarwani) (@dishaumarwani) 23 | -------------------------------------------------------------------------------- /data/1.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/numfocus/algorithm-ethics/bd4fffe7a7c377d0842ecf91e14930d712955fb0/data/1.JPG -------------------------------------------------------------------------------- /data/10.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/numfocus/algorithm-ethics/bd4fffe7a7c377d0842ecf91e14930d712955fb0/data/10.JPG -------------------------------------------------------------------------------- /data/11.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/numfocus/algorithm-ethics/bd4fffe7a7c377d0842ecf91e14930d712955fb0/data/11.JPG -------------------------------------------------------------------------------- /data/12.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/numfocus/algorithm-ethics/bd4fffe7a7c377d0842ecf91e14930d712955fb0/data/12.JPG -------------------------------------------------------------------------------- /data/13.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/numfocus/algorithm-ethics/bd4fffe7a7c377d0842ecf91e14930d712955fb0/data/13.JPG -------------------------------------------------------------------------------- /data/14.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/numfocus/algorithm-ethics/bd4fffe7a7c377d0842ecf91e14930d712955fb0/data/14.JPG -------------------------------------------------------------------------------- /data/15.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/numfocus/algorithm-ethics/bd4fffe7a7c377d0842ecf91e14930d712955fb0/data/15.JPG -------------------------------------------------------------------------------- /data/16.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/numfocus/algorithm-ethics/bd4fffe7a7c377d0842ecf91e14930d712955fb0/data/16.JPG -------------------------------------------------------------------------------- /data/17.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/numfocus/algorithm-ethics/bd4fffe7a7c377d0842ecf91e14930d712955fb0/data/17.JPG 
-------------------------------------------------------------------------------- /data/18.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/numfocus/algorithm-ethics/bd4fffe7a7c377d0842ecf91e14930d712955fb0/data/18.JPG -------------------------------------------------------------------------------- /data/19.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/numfocus/algorithm-ethics/bd4fffe7a7c377d0842ecf91e14930d712955fb0/data/19.JPG -------------------------------------------------------------------------------- /data/2.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/numfocus/algorithm-ethics/bd4fffe7a7c377d0842ecf91e14930d712955fb0/data/2.JPG -------------------------------------------------------------------------------- /data/20.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/numfocus/algorithm-ethics/bd4fffe7a7c377d0842ecf91e14930d712955fb0/data/20.JPG -------------------------------------------------------------------------------- /data/21.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/numfocus/algorithm-ethics/bd4fffe7a7c377d0842ecf91e14930d712955fb0/data/21.JPG -------------------------------------------------------------------------------- /data/22.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/numfocus/algorithm-ethics/bd4fffe7a7c377d0842ecf91e14930d712955fb0/data/22.JPG -------------------------------------------------------------------------------- /data/23.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/numfocus/algorithm-ethics/bd4fffe7a7c377d0842ecf91e14930d712955fb0/data/23.JPG -------------------------------------------------------------------------------- /data/3.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/numfocus/algorithm-ethics/bd4fffe7a7c377d0842ecf91e14930d712955fb0/data/3.JPG -------------------------------------------------------------------------------- /data/4.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/numfocus/algorithm-ethics/bd4fffe7a7c377d0842ecf91e14930d712955fb0/data/4.JPG -------------------------------------------------------------------------------- /data/5.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/numfocus/algorithm-ethics/bd4fffe7a7c377d0842ecf91e14930d712955fb0/data/5.JPG -------------------------------------------------------------------------------- /data/6.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/numfocus/algorithm-ethics/bd4fffe7a7c377d0842ecf91e14930d712955fb0/data/6.JPG -------------------------------------------------------------------------------- /data/7.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/numfocus/algorithm-ethics/bd4fffe7a7c377d0842ecf91e14930d712955fb0/data/7.JPG -------------------------------------------------------------------------------- /data/8.JPG: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/numfocus/algorithm-ethics/bd4fffe7a7c377d0842ecf91e14930d712955fb0/data/8.JPG -------------------------------------------------------------------------------- /data/9.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/numfocus/algorithm-ethics/bd4fffe7a7c377d0842ecf91e14930d712955fb0/data/9.JPG -------------------------------------------------------------------------------- /data/disc_logo.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/numfocus/algorithm-ethics/bd4fffe7a7c377d0842ecf91e14930d712955fb0/data/disc_logo.JPG -------------------------------------------------------------------------------- /references.md: -------------------------------------------------------------------------------- 1 | ## References 2 | 3 | ### Algorithms are Biased 4 | 5 | * [Turns Out Algorithms Are Racist](https://newrepublic.com/article/144644/turns-algorithms-racist) 6 | 7 | We are at a stage in technology where we need to make broad decisions about how we wish to learn from the past: do we wish to 8 | imitate it, or bring about change? We need to challenge the inequalities that are present in society. The first stage 9 | for this is the design process, where the data scientist needs to understand which features are introducing bias. 10 | One should not let the black box decide and offer pronouncements that we are encouraged to obey. 11 | 12 | * [Algorithms and bias: What lenders need to know](https://www.whitecase.com/publications/insight/algorithms-and-bias-what-lenders-need-know) 13 | 14 | This article provides a guide on what banks need to consider when using AIs to make decisions regarding lending. It details 15 | several main points: 16 | 17 | * AIs learn by using preexisting data that has the desired result already determined through manual means, which is 18 | subject to bias. 19 | * While consumers can view their own credit report and check its accuracy, there could be data from nontraditional 20 | sources which are not available to the consumer to check, and which lack transparency. This can even go as far as looking at 21 | social patterns, such as where a person shops or who they interact with. 22 | * Recommendations are made that lenders monitor changing regulations and test for potential bias. 23 | * Any feature in the algorithm should be carefully justified. 24 | * Algorithms to examine bias in AI algorithms should be developed. 25 | 26 | * [Why AI is still waiting for its ethics transplant](https://www.wired.com/story/why-ai-is-still-waiting-for-its-ethics-transplant/) 27 | 28 | This is an interview of Kate Crawford, a cofounder of AI Now, conducted by WIRED magazine. She is asked several questions, 29 | including the current state of ethics in AI, how AI developers need to hire people outside of computer science to 30 | better understand social implications, and the state of government funding of ethics research under the Trump administration. 31 | 32 | * [The Dark Secret at the Heart of AI](https://www.technologyreview.com/s/604087/the-dark-secret-at-the-heart-of-ai/) 33 | 34 | This article goes into how, even though neural networks and deep learning algorithms have been designed by humans, humanity 35 | does not really know how these AIs work. They could use reason or use instinct like real people do.
The inherent dangers in 36 | the unpredictability of AIs are summarized. 37 | 38 | * [How do machine learning algorithms learn bias?](https://towardsdatascience.com/how-do-machine-learning-algorithms-learn-bias-555809a1decb) 39 | 40 | An algorithm that has disparate impact can cause people to lose jobs and their social networks, and can create the worst possible cold start problem once someone has been released from prison. At the same time, people likely to commit crimes in the future are let go free because the algorithm is blind to their criminality. 41 | 42 | Machine learning bias is caused by source data. Data munging is not fun, and thinking about sampling, outliers and population distributions of the training set can be boring, tedious work. Indeed, machines learn bias from the oversights that occur during data munging. 43 | 44 | * [Bias in software to predict future criminals - biased against blacks](https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing) 45 | 46 | ### Solutions 47 | 48 | * [Counterfactual Fairness](https://arxiv.org/pdf/1703.06856.pdf) 49 | 50 | Causal models capture social biases and make clear the implicit trade-off between prediction accuracy and fairness in an 51 | unfair world. 52 | 53 | When is your model fair? 54 | 55 | * An algorithm is fair so long as any protected attributes A are not explicitly used in the decision-making process. 56 | 57 | * An algorithm is fair if it gives similar predictions to similar individuals. 58 | 59 | * A predictor Ŷ satisfies demographic parity if P(Ŷ | A = 0) = P(Ŷ | A = 1). 60 | 61 | * A predictor Ŷ satisfies equality of opportunity if P(Ŷ = 1 | A = 0, Y = 1) = P(Ŷ = 1 | A = 1, Y = 1). 62 | 63 | 64 | * [How We Examined Racial Discrimination in Auto Insurance Prices](https://www.propublica.org/article/minority-neighborhoods-higher-car-insurance-premiums-methodology) 65 | 66 | As AI becomes more and more complex, it can become difficult for even its own designers to understand why it acts the way it 67 | does. A nationwide study by the Consumer Federation of America in 2015 found that predominantly African-American 68 | neighborhoods pay 70 percent more, on average, for premiums than other areas do. ProPublica analyzed aggregate risk data by zip 69 | code collected by the insurance commissioners of California, Illinois, Missouri and Texas from insurers in their states. 70 | 71 | * They found that some insurers were charging statistically significantly higher premiums in predominantly minority zip 72 | codes, on average, than in similarly risky non-minority zip codes. 73 | 74 | * Studies of auto insurance rates in Texas found that “drivers in poor and minority communities were disproportionately 75 | rejected by standard insurers and forced into the higher cost non-standard” insurance plans that are designed as a last 76 | resort for people who can’t otherwise buy insurance. 77 | 78 | Similar statistics were shown in other states as well. Insurance companies argue that pricing is based entirely on risk and that 79 | they have removed racial features while training their models, but demographics are latent features that are strongly 80 | correlated with communities. These latent features bring bias into the model. 81 |
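The demographic parity and equality of opportunity definitions given above under Counterfactual Fairness translate directly into simple checks on a model's predictions, and this kind of group-wise comparison is essentially what the insurance analysis does. A minimal NumPy sketch, where the arrays are toy values for illustration only:

```python
import numpy as np

def demographic_parity_gap(y_pred, a):
    """P(Y_hat = 1 | A = 0) - P(Y_hat = 1 | A = 1); zero means parity."""
    return y_pred[a == 0].mean() - y_pred[a == 1].mean()

def equal_opportunity_gap(y_pred, y_true, a):
    """P(Y_hat = 1 | A = 0, Y = 1) - P(Y_hat = 1 | A = 1, Y = 1)."""
    mask0 = (a == 0) & (y_true == 1)
    mask1 = (a == 1) & (y_true == 1)
    return y_pred[mask0].mean() - y_pred[mask1].mean()

# Toy example: binary predictions, true outcomes, and a protected attribute A.
y_pred = np.array([1, 1, 0, 1, 0, 0, 1, 0])
y_true = np.array([1, 1, 0, 1, 1, 0, 1, 0])
a = np.array([0, 0, 0, 0, 1, 1, 1, 1])

print("Demographic parity gap:", demographic_parity_gap(y_pred, a))
print("Equal opportunity gap:", equal_opportunity_gap(y_pred, y_true, a))
```

Non-zero gaps do not settle the question on their own, but they flag where latent or proxy features may be reproducing a protected characteristic.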
82 | * [How I'm Fighting Bias in Algorithms (Ted Talk)](http://www.ted.com/talks/joy_buolamwini_how_i_m_fighting_bias_in_algorithms) 83 | 84 | What if you go to the washroom and the tap does not respond to your hands, but it does for your friend’s? Computer 85 | vision uses images to recognize people and objects. But what if we train it with an undiversified dataset? It will only 86 | respond to white skin because that is all it has seen. There have been reported incidents where a webcam was not able to 87 | detect black faces.[1] Not only this, image recognition software has been associating cleaning or the kitchen with women 88 | and sports with men, for example. 89 | 90 | There needs to be a system that audits existing software for the impact it has on minorities. There is a need to 91 | diversify our data. You can [report](https://www.ajlunited.org/fight) discrimination that you’ve faced while using AI. 92 | 93 | * [Controlling machine learning algorithms and their biases](https://www.mckinsey.com/business-functions/risk/our-insights/controlling-machine-learning-algorithms-and-their-biases) 94 | 95 | * [Attacking Discrimination in ML - Google research paper](https://research.google.com/bigpicture/attacking-discrimination-in-ml/) 96 | 97 | * [Equality of Opportunity](https://drive.google.com/file/d/0B-wQVEjH9yuhanpyQjUwQS1JOTQ/view) 98 | 99 | * [Link to White Paper Containing Question Tool](https://cdt.org/issue/privacy-data/digital-decisions/) 100 | 101 | ### Involved Organizations and Communities 102 | 103 | * [AI Now](https://ainowinstitute.org/) 104 | 105 | AI Now is a research center based out of New York University focusing on the social implications of AI. Their research 106 | focuses on studying biases made by AIs that are used to make decisions related to housing, criminal justice, and employment. 107 | They also focus on determining the bias in datasets used to train AIs. Finally, they study how to safely integrate AIs into 108 | algorithms used by critical infrastructures. 109 | 110 | * [AI Now report](https://assets.contentful.com/8wprhhvnpfc0/1A9c3ZTCZa2KEYM64Wsc2a/8636557c5fb14f2b74b2be64c3ce0c78/_AI_Now_Institute_2017_Report_.pdf) 111 | 112 | The AI Now report summarizes several recommendations made by AI Now on the implementation of AI algorithms for making 113 | decisions in hiring and housing, as well as on addressing biases in AI research itself. The report also contains a comprehensive 114 | literature review summarizing recent work in studying the social implications of AI. 115 | 116 | * [DeepMind Ethics Research Group - Google](https://deepmind.com/applied/deepmind-ethics-society/research/) 117 | 118 | DeepMind's Ethics & Society is a research unit within DeepMind that aims to explore and better understand the real-world 119 | impacts of AI. It aims to help technologists put ethics into practice and help society anticipate and control the effects 120 | of AI, by raising questions around: 121 | * privacy 122 | * transparency 123 | * fairness 124 | * governance and accountability that should be addressed throughout the life cycle of projects. 125 | 126 | * [Why We Launched DeepMind Ethics & Society](https://deepmind.com/blog/why-we-launched-deepmind-ethics-society/) 127 | 128 | Quoting DeepMind's Ethics webpage: 129 | 130 | "The development of AI creates important and complex questions. Its impact on society—and on all our lives—is not something that should be left to chance. Beneficial outcomes and protections against harms must be actively fought for and built-in 131 | from the beginning.
But in a field as complex as AI, this is easier said than done." 132 | 133 | DeepMind, being a world leader in artificial intelligence research and its application for positive impact, provides a clear motivation and need to include ethics in AI. It is imperative to have similar ethics groups working hand in hand with the development teams in all organizations. 134 | 135 | * [Responsible Data Science](http://www.responsibledatascience.org/) 136 | 137 | Responsible Data Science is a novel collaboration between principal scientists from 11 universities and research institutes 138 | in the Netherlands. RDS aims to provide the technology to build fairness, accuracy, confidentiality, and transparency into 139 | systems, thus ensuring responsible use without inhibiting the power of data science. They conduct seminars and workshops (so 140 | far within the Netherlands) and the material is available on their webpage as well. 141 | 142 | * [Principles for Accountable Algorithms](https://www.fatml.org/resources/principles-for-accountable-algorithms) 143 | 144 | * [IEEE standards for AI ethics](http://standards.ieee.org/develop/indconn/ec/autonomous_systems.html) 145 | 146 | * [Fairness, Accountability, and Transparency in Machine Learning](https://fatconference.org/resources.html) 147 | 148 | ### Miscellaneous 149 | 150 | * [Make Algorithms Accountable](https://www.nytimes.com/2016/08/01/opinion/make-algorithms-accountable.html) 151 | 152 | This is an article about how there needs to be increased transparency and accountability when it comes to examining 153 | data used by algorithms. A case where an algorithm predicting the risk of criminal recidivism was biased against black 154 | defendants is presented as an example of such a need. The article goes over efforts by the European Union to regulate 155 | such algorithms as well as recommendations from the White House on holding such algorithms accountable. 156 | 157 | * [IEEE standards for AI ethics](http://standards.ieee.org/develop/indconn/ec/autonomous_systems.html) 158 | 159 | The IEEE has published standards for the ethical application of AI. They are grouped into these broad categories covering 160 | social, privacy, and military impacts: 161 | 162 | * Executive Summary 163 | * General Principles 164 | * Embedding Values Into Autonomous Intelligent Systems 165 | * Methodologies to Guide Ethical Research and Design 166 | * Safety and Beneficence of Artificial General Intelligence (AGI) and Artificial Superintelligence (ASI) 167 | * Personal Data and Individual Access Control 168 | * Reframing Autonomous Weapons Systems 169 | * Economics/Humanitarian Issues 170 | * Law 171 | 172 | * [Algorithmic accountability](https://techcrunch.com/2017/04/30/algorithmic-accountability/) 173 | 174 | * [Partnership on AI](https://www.partnershiponai.org) 175 | 176 | * [Book: What Algorithms Want - Ed Finn](https://mitpress.mit.edu/books/what-algorithms-want) 177 | 178 | --------------------------------------------------------------------------------