├── .DS_Store
├── Ads-Click-Prediction-Report
├── README.md
└── ad-click-prediction-dt.ipynb
├── README.md
├── Who Get The Job (Turkish)
├── README.md
└── who-gets-the-job-veri-analizi.ipynb
└── Wine Production by Country
├── .DS_Store
├── DataSet
└── Wine_Production_by_country.xlsx
├── README.md
└── wine-production-analysis.ipynb
/.DS_Store:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leventozdemir/Data-Analysis-Projects/86dbcffb97c0cb827f851762e465f13ee8c0560a/.DS_Store
--------------------------------------------------------------------------------
/Ads-Click-Prediction-Report/README.md:
--------------------------------------------------------------------------------
1 | # Ads-Click-Prediction-Report:
2 |
3 |
4 |
5 | - DataSet: Ad Click Prediction
6 | - Problem: Classification Problem
7 | - DataSet Source: https://www.kaggle.com/jahnveenarang/cvdcvd-vd
8 | - DataSet Description:
9 | 1. 'User ID': unique identifier for each consumer.
10 | 2. 'Age': consumer's age in years.
11 | 3. 'Estimated Salary': average income of the consumer.
12 | 4. 'Gender': whether the consumer is male or female.
13 | 5. 'Purchased': 0 or 1, indicating whether the consumer clicked on the ad.
14 |
15 | - Features after Preprocessing:
16 |
17 | 1. 'Age': consumer's age in years.
18 | 2. 'Estimated Salary': average income of the consumer.
19 | 3. 'Gender': whether the consumer is male [0] or female [1].
20 | 4. 'Purchased': 0 or 1, indicating whether the consumer clicked on the ad.
21 |
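The preprocessing described above (dropping 'User ID' and encoding 'Gender' as 0/1) could look roughly like the sketch below. The sample rows and the raw `"Male"`/`"Female"` strings are assumptions for illustration; the notebook may do this differently, for example with pandas.

```python
# Hypothetical sample rows; the values are made up for illustration.
raw_rows = [
    {"User ID": 1, "Gender": "Male", "Age": 35,
     "Estimated Salary": 20000, "Purchased": 0},
    {"User ID": 2, "Gender": "Female", "Age": 47,
     "Estimated Salary": 25000, "Purchased": 1},
]

# Encoding stated in the README: male -> 0, female -> 1.
# The raw string spellings are an assumption.
GENDER_CODES = {"Male": 0, "Female": 1}


def preprocess(row):
    """Drop the identifier column and encode gender as an integer."""
    return {
        "Age": row["Age"],
        "Estimated Salary": row["Estimated Salary"],
        "Gender": GENDER_CODES[row["Gender"]],
        "Purchased": row["Purchased"],
    }


processed = [preprocess(r) for r in raw_rows]
print(processed[0])
```

'User ID' is dropped because a unique identifier carries no predictive signal for a classifier.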
22 |
23 | - Machine Learning Model and Evaluation:
24 | ➔ Decision Tree Classifier model.
25 | ➔ Accuracy Score for evaluation.
26 |
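Accuracy is the fraction of predictions that match the true labels. A minimal, dependency-free sketch with made-up labels follows; scikit-learn's `accuracy_score` computes the same quantity, and the helper name `accuracy` here is only illustrative.

```python
def accuracy(y_true, y_pred):
    """Return the share of predictions that equal the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Made-up labels: 1 = clicked on the ad, 0 = did not click.
y_true = [0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1]

score = accuracy(y_true, y_pred)
print(score)
```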
27 |
28 | # Male Samples
29 |
30 |
31 |
32 |
33 | - Males aged between [40,60] tend to click, with a negative correlation to salary.
34 |
35 | - Males aged between [20,40] tend not to click, with a positive correlation to salary.
36 |
37 |
38 |
39 | # Female Samples
40 |
41 |
42 |
43 |
44 | - Females with a salary between [8000,140000] tend to click, with a negative correlation to age.
45 |
46 | - Females with a salary between [20000,80000] tend to click, with a positive correlation to age.
47 |
48 | # Age>40 for both genders
49 |
50 | Females click more than males.
51 |
52 | # Decision Tree Model
53 | 
54 |
55 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Data-Analysis-Projects
2 |
3 |
4 |
5 | ## [- Ads Click Prediction Report:](https://github.com/leventozdemir/Data-Analysis-Projects/tree/main/Ads-Click-Prediction-Report)
6 |
7 | **-Dataset Source:** https://www.kaggle.com/jahnveenarang/cvdcvd-vd
8 |
9 | **-Code On Kaggle:** https://www.kaggle.com/leventoz/ad-click-prediction-dt
10 |
11 |
12 |
13 | ## [- Who Get The Job (Turkish):](https://github.com/leventozdemir/Data-Analysis-Projects/tree/main/Who%20Get%20The%20Job%20(Turkish))
14 |
15 | **-Dataset Source:** https://www.kaggle.com/c/datathon-who-gets-the-job
16 |
17 | **-Code On Kaggle:** https://www.kaggle.com/leventoz/who-gets-the-job-veri-analizi-yetenek-hari
18 |
19 |
20 |
21 |
22 | ## [- Wine Production by Country:](https://github.com/leventozdemir/Data-Analysis-Projects/tree/main/Wine%20Production%20by%20Country)
23 |
24 | **-Dataset Source:** https://www.kaggle.com/shitalgaikwad123/wine-production-by-country
25 |
26 | **-Code On Kaggle:** https://www.kaggle.com/leventoz/wine-production-analysis
27 |
28 |
--------------------------------------------------------------------------------
/Who Get The Job (Turkish)/README.md:
--------------------------------------------------------------------------------
1 | # Who Get The Job:
2 |
3 |
4 |
5 | # Description:
6 | In this Datathon, organized by Techcareer.net, we expect you to develop an algorithm that compares the qualifications sought in each posting with the candidates' résumés and to find the candidate who got the job. The data covers 1,059,462 applications from 348,142 candidates to 2,163 technology job postings that led to a hire on Kariyer.net over the last 3 years.
7 |
8 | # Data:
9 | The job postings applied to by candidates who were hired through Kariyer.net in the technology field over the last 3 years have been shared with you. What we expect from you is to find which candidate got each of the job postings given in test.csv! We are sharing 6 different training datasets that you can use. You can find detailed descriptions in the Data tab.
10 |
11 | ## - Datathon_Aday.csv:
12 | The candidate's school information and city of residence.
13 | ## - Datathon_Basvuru.csv:
14 | Which candidate applied to which job posting.
15 | ## - Datathon_Basvuru_iseAlinanlar.csv:
16 | The candidates who were hired as a result of their application.
17 | ## - Datathon_ilan.csv:
18 | Posting text, position title, location, and similar information.
19 | ## - Datathon_Tecrube.csv:
20 | The candidates' past work experience.
21 |
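One way to turn the applications and hires files above into training labels is to mark each application as hired or not. The sketch below uses in-memory rows, and the field names `candidate_id` and `posting_id` are hypothetical; the actual column names in the CSV files are not given here.

```python
# Hypothetical rows standing in for Datathon_Basvuru.csv (applications).
applications = [
    {"candidate_id": 101, "posting_id": 1},
    {"candidate_id": 102, "posting_id": 1},
    {"candidate_id": 101, "posting_id": 2},
]

# Hypothetical rows standing in for Datathon_Basvuru_iseAlinanlar.csv (hires).
hires = [
    {"candidate_id": 102, "posting_id": 1},
]

# A set of (candidate, posting) pairs gives constant-time membership checks.
hired_pairs = {(h["candidate_id"], h["posting_id"]) for h in hires}

# Attach a boolean label to every application.
labeled = [
    {**a, "hired": (a["candidate_id"], a["posting_id"]) in hired_pairs}
    for a in applications
]

for row in labeled:
    print(row)
```

With over a million applications, the same join would more likely be done with a dataframe merge, but the labeling logic is the same.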
--------------------------------------------------------------------------------
/Wine Production by Country/.DS_Store:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leventozdemir/Data-Analysis-Projects/86dbcffb97c0cb827f851762e465f13ee8c0560a/Wine Production by Country/.DS_Store
--------------------------------------------------------------------------------
/Wine Production by Country/DataSet/Wine_Production_by_country.xlsx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leventozdemir/Data-Analysis-Projects/86dbcffb97c0cb827f851762e465f13ee8c0560a/Wine Production by Country/DataSet/Wine_Production_by_country.xlsx
--------------------------------------------------------------------------------
/Wine Production by Country/README.md:
--------------------------------------------------------------------------------
1 | # Wine Production by Country
2 |
3 | 
4 |
5 | ## - Library to install:
6 | pip install openpyxl
7 |
8 | ## - DataSet:
9 | - Context: This dataset records wine production by country.
10 |
11 | - Content: It can be used by beginners for EDA, data visualization, etc.
12 |
13 | - Acknowledgements: DataWorld.
14 | - File Type: Excel.
15 |
16 |
17 | ## Calling libraries and data:
18 |
19 | import pandas as pd
20 | import matplotlib.pyplot as plt
21 | %matplotlib inline
22 | import seaborn as sns
23 |
24 | DATA_PATH = "DataSet/Wine_Production_by_country.xlsx"
25 | data= pd.read_excel(DATA_PATH, names=["Country","Year", "Wine production in mhl"], engine='openpyxl')
26 |
27 | ## Exploring the data:
28 | data.head()
29 | data.shape
30 | data.isnull().sum()
31 | data.columns
32 | countries = data.Country.unique()
33 | print(countries)
34 | print(len(countries))
35 | data.describe().transpose()
36 |
37 |
38 |
39 |
40 |
41 | ## Exploration Report:
42 |
43 | - #### We have 120 rows and 3 features
44 | - #### No null values
45 | - #### The countries we have (24 entries, including the 'World total' aggregate):
46 | - 'Italy'
47 | - 'France'
48 | - 'Spain'
49 | - 'United States'
50 | - 'Australia'
51 | - 'China'
52 | - 'South Africa'
53 | - 'Chile'
54 | - 'Argentina'
55 | - 'Germany'
56 | - 'Portugal'
57 | - 'Russia'
58 | - 'Romania'
59 | - 'New Zealand'
60 | - 'Greece'
61 | - 'Serbia'
62 | - 'Austria'
63 | - 'Hungary'
64 | - 'Moldova'
65 | - 'Brazil'
66 | - 'Bulgaria'
67 | - 'Georgia'
68 | - 'Switzerland'
69 | - 'World total'
70 |
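Because 'World total' is an aggregate rather than a country, it may be worth excluding before comparing countries; otherwise it dominates any ranking. A minimal sketch with made-up production figures (in pandas the equivalent filter is `data[data["Country"] != "World total"]`):

```python
COL = "Wine production in mhl"
AGGREGATE_LABEL = "World total"

# Made-up figures for illustration only; not taken from the dataset.
rows = [
    {"Country": "Italy", "Year": 2016, COL: 10.0},
    {"Country": "France", "Year": 2016, COL: 8.0},
    {"Country": "World total", "Year": 2016, COL: 18.0},
]

# Keep only real countries so the aggregate row does not skew comparisons.
country_rows = [r for r in rows if r["Country"] != AGGREGATE_LABEL]

top = max(country_rows, key=lambda r: r[COL])
print(top["Country"])
```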
71 |
72 |
73 |
74 |
75 |
76 |
77 |
78 |
79 | ## Data Visualization:
80 | data.hist(figsize=(15,5))
81 | 
82 |
83 | plt.figure(figsize=(10,10))
84 | plot = sns.barplot(x=data['Year'], y=data["Wine production in mhl"])
85 |
86 | 
87 |
88 | plt.figure(figsize=(10,10))
89 | plot = sns.barplot(x=data['Year'], y=data["Country"])
90 | plot.set(xlim=(data['Year'].min(), data['Year'].max()))
91 |
92 | 
93 |
94 | data['Country'].value_counts().plot(kind='pie', figsize=(10,10), autopct="%1.2f%%")
95 | plt.title("Country Pie")
96 | plt.show()
97 |
98 | 
99 |
100 | data['Wine production in mhl'].value_counts().plot(kind='pie', figsize=(10,10), autopct="%1.2f%%")
101 | plt.title("Wine production in mhl Pie")
102 | plt.show()
103 |
104 | 
105 |
106 | data_high = data.loc[data["Wine production in mhl"]==2.3]
107 | data_high['Country'].value_counts().plot(kind='pie', figsize=(10,10), autopct="%1.2f%%")
108 | plt.title("Country Pie")
109 | plt.show()
110 |
111 | 
112 |
113 |
114 |
115 |
116 |
117 | #### The End: **Looking at the pie chart, 2.3 is the most frequently repeated value in the dataset, so we take it as our reference. On the assumption that high production reflects high demand, we treat 2.3 as the most in-demand production level and look at which countries produce at that level most often.**
118 |
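The "most repeated value" step can also be computed rather than read off the pie chart. The sketch below uses made-up rows, and `math.isclose` is a swapped-in safeguard against floating-point equality issues, not something taken from the notebook.

```python
import math
from collections import Counter

COL = "Wine production in mhl"

# Made-up rows for illustration only; not taken from the dataset.
rows = [
    {"Country": "Italy", COL: 2.3},
    {"Country": "France", COL: 4.1},
    {"Country": "Spain", COL: 2.3},
    {"Country": "Chile", COL: 2.3},
    {"Country": "Greece", COL: 4.1},
    {"Country": "Brazil", COL: 1.0},
]

# Count how often each production value occurs and take the most frequent.
counts = Counter(r[COL] for r in rows)
most_common_value, frequency = counts.most_common(1)[0]

# Collect the countries that report that value, using a tolerant comparison.
countries_at_mode = sorted(
    {r["Country"] for r in rows if math.isclose(r[COL], most_common_value)}
)

print(most_common_value, frequency, countries_at_mode)
```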
119 |
120 |
121 |
122 |
123 |
124 |
125 |
126 |
127 |
--------------------------------------------------------------------------------