├── .DS_Store
├── Ads-Click-Prediction-Report
    ├── README.md
    └── ad-click-prediction-dt.ipynb
├── README.md
├── Who Get The Job (Turkish)
    ├── README.md
    └── who-gets-the-job-veri-analizi.ipynb
└── Wine Production by Country
    ├── .DS_Store
    ├── DataSet
        └── Wine_Production_by_country.xlsx
    ├── README.md
    └── wine-production-analysis.ipynb

/.DS_Store:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leventozdemir/Data-Analysis-Projects/86dbcffb97c0cb827f851762e465f13ee8c0560a/.DS_Store

--------------------------------------------------------------------------------
/Ads-Click-Prediction-Report/README.md:
--------------------------------------------------------------------------------
# Ads-Click-Prediction-Report:

- Dataset: Ad Click Prediction
- Problem: Classification Problem
- Dataset Source: https://www.kaggle.com/jahnveenarang/cvdcvd-vd
- Dataset Description:

1. 'User ID': unique identifier for each consumer.
2. 'Age': customer age in years.
3. 'Estimated Salary': average income of the consumer.
4. 'Gender': whether the consumer is male or female.
5. 'Purchased': 0 or 1, indicating whether the consumer clicked on the ad.

- Features after Preprocessing:

1. 'Age': customer age in years.
2. 'Estimated Salary': average income of the consumer.
3. 'Gender': whether the consumer is male [0] or female [1].
4. 'Purchased': 0 or 1, indicating whether the consumer clicked on the ad.

- Machine Learning Model and Evaluation:
  ➔ Decision Tree Classifier model.
  ➔ Accuracy Score for evaluation.

[Figure 1]
# Male Samples
[Figure 2]

[Figure 3]

- Males aged [40,60] click, with a negative correlation to salary.

- Males aged [20,40] don't click, with a positive correlation to salary.

# Female Samples
[Figure 4]
[Figure 5]

- Females with a salary between [8000,140000] click, with a negative correlation to age.

- Females with a salary between [20000,80000] click, with a positive correlation to age.

# Age>40 for both genders
[Figure 6]
Females click more than males.

# Decision Tree Model
![diab](https://user-images.githubusercontent.com/51120437/126098495-c955b43d-74ab-49f0-b0f7-fc8af79d3807.jpg)

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Data-Analysis-Projects


## [- Ads Click Prediction Report:](https://github.com/leventozdemir/Data-Analysis-Projects/tree/main/Ads-Click-Prediction-Report)

**-Dataset Source:** https://www.kaggle.com/jahnveenarang/cvdcvd-vd

**-Code On Kaggle:** https://www.kaggle.com/leventoz/ad-click-prediction-dt

## [- Who Get The Job (Turkish):](https://github.com/leventozdemir/Data-Analysis-Projects/tree/main/Who%20Get%20The%20Job%20(Turkish))

**-Dataset Source:** https://www.kaggle.com/c/datathon-who-gets-the-job

**-Code On Kaggle:** https://www.kaggle.com/leventoz/who-gets-the-job-veri-analizi-yetenek-hari

## [- Wine Production by Country:](https://github.com/leventozdemir/Data-Analysis-Projects/tree/main/Wine%20Production%20by%20Country)

**-Dataset Source:** https://www.kaggle.com/shitalgaikwad123/wine-production-by-country

**-Code On Kaggle:** https://www.kaggle.com/leventoz/wine-production-analysis

--------------------------------------------------------------------------------
/Who Get The Job (Turkish)/README.md:
--------------------------------------------------------------------------------
# Who Get The Job:


# Description:
In this Datathon, organized by Techcareer.net, we expect you to develop an algorithm that compares the qualifications sought in each posting with the candidates' résumés, and to find the candidate who got the job. The data covers 2,163 technology job postings that led to a hire on Kariyer.net over the last 3 years, with 1,059,462 applications from 348,142 candidates.

# Data:
The job postings applied to by candidates who were hired through Kariyer.net in the technology field over the last 3 years are shared with you. What we expect from you is to find which candidate landed each of the job postings given in test.csv! We are sharing 6 different training datasets that you can use. You can find detailed descriptions in the Data tab.

## - Datathon_Aday.csv:
The candidate's school information and city of residence
## - Datathon_Basvuru.csv:
Which candidate applied to which job posting
## - Datathon_Basvuru_iseAlinanlar.csv:
The candidates who were hired as a result of their application
## - Datathon_ilan.csv:
Information such as the posting text, position title, and location
## - Datathon_Tecrube.csv:
The candidates' past work experience

--------------------------------------------------------------------------------
/Wine Production by Country/.DS_Store:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leventozdemir/Data-Analysis-Projects/86dbcffb97c0cb827f851762e465f13ee8c0560a/Wine Production by Country/.DS_Store

--------------------------------------------------------------------------------
/Wine Production by Country/DataSet/Wine_Production_by_country.xlsx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leventozdemir/Data-Analysis-Projects/86dbcffb97c0cb827f851762e465f13ee8c0560a/Wine Production by Country/DataSet/Wine_Production_by_country.xlsx

--------------------------------------------------------------------------------
/Wine Production by Country/README.md:
--------------------------------------------------------------------------------
# Wine Production by Country

![intro_wme](https://user-images.githubusercontent.com/51120437/126898167-e48e873b-4ee7-4c40-873c-17ea76ac1abd.jpeg)

## - Library to install:
    pip install openpyxl

## - DataSet:
- Context: This dataset contains wine production for each country.

- Content: It can be used by beginners for EDA, data visualization, etc.

- Acknowledgements: DataWorld.
- File Type: Excel.

## Calling libraries and data:

    import pandas as pd
    import matplotlib.pyplot as plt
    # Jupyter magic; omit this line when running outside a notebook
    %matplotlib inline
    import seaborn as sns

    DATA_PATH = "DataSet/Wine_Production_by_country.xlsx"
    # Read the Excel sheet and rename its three columns
    data = pd.read_excel(DATA_PATH, names=["Country", "Year", "Wine production in mhl"], engine='openpyxl')

## Exploring the data:
    data.head()
    data.shape
    data.isnull().sum()
    data.columns
    countries = data.Country.unique()
    print(countries)
    print(len(countries))
    data.describe().transpose()

## Exploration Report:

- #### We have 120 rows and 3 features
- #### No null values
- #### The countries we have (24):
   - 'Italy'
   - 'France'
   - 'Spain'
   - 'United States'
   - 'Australia'
   - 'China'
   - 'South Africa'
   - 'Chile'
   - 'Argentina'
   - 'Germany'
   - 'Portugal'
   - 'Russia'
   - 'Romania'
   - 'New Zealand'
   - 'Greece'
   - 'Serbia'
   - 'Austria'
   - 'Hungary'
   - 'Moldova'
   - 'Brazil'
   - 'Bulgaria'
   - 'Georgia'
   - 'Switzerland'
   - 'World total'

## Data Visualization:
    data.hist(figsize=(15,5))

![bar](https://user-images.githubusercontent.com/51120437/126898332-688972e5-85d0-4d58-b3e5-e4963b2a3f83.png)

    plt.figure(figsize=(10,10))
    # x and y are passed by keyword; recent seaborn versions reject them as positional arguments
    plot = sns.barplot(x=data['Year'], y=data["Wine production in mhl"])

![2](https://user-images.githubusercontent.com/51120437/126898344-9247c24b-3d89-44b7-8a11-af9070dbfd18.png)

    plt.figure(figsize=(10,10))
    plot = sns.barplot(x=data['Year'], y=data["Country"])
    plot.set(xlim=(data['Year'].min(), data['Year'].max()))

![3](https://user-images.githubusercontent.com/51120437/126898354-8d99c0f7-2b01-4cef-864e-0b7ea66c708b.png)

    data['Country'].value_counts().plot(kind='pie', figsize=(10,10), autopct="%1.2f%%")
    plt.title("Country Pie")
    plt.show()

![3](https://user-images.githubusercontent.com/51120437/126898369-d4ecb834-aadb-45b3-a24d-e815952b6eb8.png)

    data['Wine production in mhl'].value_counts().plot(kind='pie', figsize=(10,10), autopct="%1.2f%%")
    plt.title("Wine production in mhl Pie")
    plt.show()

![3](https://user-images.githubusercontent.com/51120437/126898375-76f2557e-7c14-45e6-9726-d6422796a18d.png)

    # Keep only the rows whose production equals the most repeated value (2.3)
    data_high = data.loc[data["Wine production in mhl"] == 2.3]
    data_high['Country'].value_counts().plot(kind='pie', figsize=(10,10), autopct="%1.2f%%")
    plt.title("Country Pie")
    plt.show()

![3](https://user-images.githubusercontent.com/51120437/126898384-96f2f864-8bf4-4abb-b420-cdae3ea6ce0c.png)

#### The End: **Looking at the production pie chart, we can see that 2.3 is the most frequently repeated value in the dataset, so let's take it as our reference: high production implies high consumption, which makes 2.3 the most in-demand level. The last pie chart shows which countries produce at that level most often.**

--------------------------------------------------------------------------------
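The last step of the Wine Production README filters on a hard-coded 2.3, the value read off the pie chart. A minimal sketch of deriving that reference value from the data instead is shown here. It uses only the standard library so it runs without the Excel file; the rows are made-up illustrative values, not taken from the real dataset, and the helper names `most_common_production` and `countries_at_value` are hypothetical.

```python
from collections import Counter

# Made-up (country, year, production) rows for illustration only;
# these are not values from the real dataset.
rows = [
    ("A", 2014, 2.3),
    ("A", 2015, 2.3),
    ("B", 2014, 2.3),
    ("C", 2014, 10.0),
    ("C", 2015, 11.5),
]


def most_common_production(rows):
    """Return the production value that occurs most often in `rows`."""
    counts = Counter(production for _, _, production in rows)
    value, _ = counts.most_common(1)[0]
    return value


def countries_at_value(rows, value):
    """Count, per country, the rows whose production equals `value`."""
    return Counter(
        country for country, _, production in rows if production == value
    )


reference = most_common_production(rows)
by_country = countries_at_value(rows, reference)
print(reference, dict(by_country))
```

With the README's DataFrame, the same idea would likely be `data["Wine production in mhl"].mode()` to find the reference value, followed by the existing filter and `value_counts()` call.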