├── README.md
├── numpy.md
├── pandas.md
└── scikit-learn.md
/README.md:
--------------------------------------------------------------------------------
1 | # Code Snippet
2 |
3 | ## Pandas
4 | https://github.com/dmahdipour/code-snippet/blob/main/pandas.md
5 |
6 | ## Numpy
7 | https://github.com/dmahdipour/code-snippet/blob/main/numpy.md
8 |
9 | ## Scikit-learn
10 | https://github.com/dmahdipour/code-snippet/blob/main/scikit-learn.md
11 |
--------------------------------------------------------------------------------
/numpy.md:
--------------------------------------------------------------------------------
1 | ### 1- Install
2 | `pip install numpy`
3 |
4 | ### 2- Import
5 | `import numpy as np`
6 |
7 | ### 3- Define an array (Matrix with 2 rows and 3 columns)
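8 | `np.array([[1, 2, 3],[4, 5, 6]])` *If the rows have different lengths, NumPy cannot build a numeric matrix. Older versions return an array of lists instead of numbers; NumPy 1.24+ raises ValueError unless* `dtype=object` *is given:* `np.array([[1, 2],[4, 5, 6]], dtype=object)`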
9 |
10 | ### 4- Define an array (Ordinary Numbers)
11 | `np.arange(2, 9, 0.1)` *Start: 2, stop: 9 (excluded), step: 0.1*
12 | *Use an integer step if you want integer numbers:* `np.arange(2, 9, 3)`
13 | `np.ones((3,4), dtype=int)` *All items are 1. For a 1D array:* `np.ones(3)` *The default dtype is float.*
14 | *To fill with another number (e.g. 5) instead of 1:* `np.full((3,4), 5)`
15 | `np.zeros((3,4), dtype=int)` *All items are 0. For a 1D array:* `np.zeros(3)`
16 | `np.eye(3)` *Create an identity matrix: 1 on the main diagonal and 0 everywhere else*
17 | `np.linspace(1, 4, num=10)` *Create an array of 10 evenly spaced items from 1 to 4 (both included)*
18 | `np.random.random((3,4))` *Create a 3x4 array of random floats in [0, 1). To get 9 random integers between 0 and 9 as a 3x3 array:* `np.random.randint(10, size=9).reshape(3,3)`
19 |
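*A small sketch of the range helpers above, to make the stop and shape rules concrete. The variable names are only for illustration.*

```python
import numpy as np

# arange excludes the stop value: 2, 5, 8 (9 is never reached).
stepped = np.arange(2, 9, 3)

# linspace includes both endpoints and returns exactly `num` items.
spaced = np.linspace(1, 4, num=10)

# random.random takes a shape; the values fall in [0, 1).
noise = np.random.random((3, 4))

# randint(10, size=9) draws 9 integers from 0..9, reshaped here to 3x3.
grid = np.random.randint(10, size=9).reshape(3, 3)

print(stepped.tolist())
print(spaced[0], spaced[-1], spaced.size)
print(noise.shape, grid.shape)
```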
20 | ### 5- Change the Shape of Arrays
21 | `myArray=np.arange(12).reshape(3, 4)` *To turn the 2D array back into 1D:* `myArray.reshape(12)` *To make it 1 row with 12 columns:* `myArray.reshape(1,12)` *and 12 rows with 1 column:* `myArray.reshape(12,1)`
22 | `myArray.flatten()` *Convert any array to a 1D copy*
23 |
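*A quick check of the shapes produced by the reshape calls above. This is a sketch; the names `flat`, `one_row`, `one_col` and `copy_1d` are made up for the example.*

```python
import numpy as np

myArray = np.arange(12).reshape(3, 4)   # 3 rows, 4 columns

flat = myArray.reshape(12)        # shape (12,): 1D
one_row = myArray.reshape(1, 12)  # shape (1, 12): 2D with 1 row
one_col = myArray.reshape(12, 1)  # shape (12, 1): 2D with 12 rows
copy_1d = myArray.flatten()       # 1D copy of the data

# flatten() copies, so editing the copy leaves myArray untouched.
copy_1d[0] = 99

print(flat.shape, one_row.shape, one_col.shape, myArray[0, 0])
```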
24 | ### 6- Update Array's Values
25 | `myArray[2]=30` *To update a range of indexes:* `myArray[1:4]=10` *and for a range inside one row of a 2D array:* `myArray[1][0:3]=10`
26 |
27 | ### 7- Get Array's Dimension, Length and Size:
28 | `myArray.ndim` *1 for 1D, 2 for 2D, etc.*
29 | `myArray.shape` *e.g. (3, 4) - to get the number of rows:* `myArray.shape[0]` *or* `len(myArray)` *and to get the number of columns:* `myArray.shape[1]`
30 | `myArray.size` *Get the total number of items, e.g. 12*
31 |
32 | ### 8- Convert List to Array
33 | ```
34 | myList=[[1,2,3],[4,5,6]]
35 | myArray=np.asarray(myList)
36 | ```
37 |
38 | ### 9- Built-in Functions
39 | *Add axis=0 to compute down each column and axis=1 to compute along each row of a 2D array. Without axis, the whole array is used.*
40 | `np.min(myArray, axis=0)` `np.max(myArray)` *Min and Max*
41 | `np.argmin(myArray)` `np.argmax(myArray, axis=1)` *Indexes of the Min and Max items (without axis, the index refers to the flattened array)*
42 | `np.mean(myArray, axis=1)` *Average of each row*
43 | `np.median(myArray, axis=0)` *Median (the middle value of the sorted items) of each column*
44 | `np.std(myArray, axis=0)` *Standard deviation of each column*
45 | `np.sum(myArray, axis=1)` *Sum of the items of each row (use axis=0 for each column)*
46 |
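*To see what the axis argument does, here is a minimal example on a 3x4 array. The result names are only illustrative.*

```python
import numpy as np

myArray = np.arange(12).reshape(3, 4)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

col_min = np.min(myArray, axis=0)    # one value per column
row_sum = np.sum(myArray, axis=1)    # one value per row
row_mean = np.mean(myArray, axis=1)  # one value per row
total = np.sum(myArray)              # no axis: the whole array

print(col_min.tolist(), row_sum.tolist(), row_mean.tolist(), total)
```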
47 | ### 10- Indexing and Accessing Items
48 | `myArray[1]` *or for 2D:* `myArray[1,2]` *The first index is for the row and the second one for the column*
49 | `myArray[0:2,:]` *Get rows 0 and 1 (the stop index 2 is excluded) with all columns*
50 | `myArray[:,1:]` *Get all rows, with columns starting from 1*
51 | `myArray[-3:]` *Get just the last 3 items (the last 3 rows of a 2D array)*
52 | `myArray[:-2]` *Get all items except the last 2*
53 |
54 | ### 11- Loops
55 | ```
56 | for row in myArray:
57 |     print(row+3)  # adds 3 to every item of the row
58 |
59 |
60 | for row in myArray:
61 |     for item in row:
62 |         print(item*3)  # visits the items one by one
63 | ```
64 |
65 | ### 12- Mathematics and Logic
66 | `myArray.T` *Transpose*
67 | `myArray+10` *Add 10 to each item. Subtraction, multiplication, division, power, etc. can be used the same way.*
68 | `myArray + myArray2` *Add the items of 2 arrays element by element*
69 | `myArray>2` *Get True for each item that meets the condition and False otherwise. To get the items instead of True/False:* `myArray[myArray>2]`
70 | `np.sqrt(myArray)` *Also ceil, sin, cos, tan, arcsin, arccos, arctan, etc. can be used.*
71 | *for more:* https://numpy.org/doc/stable/reference/routines.math.html
72 | `myArray.dot(myArray2)` *or* `np.dot(myArray, myArray2)` *or* `np.matmul(myArray, myArray2)`
73 | *The number of columns in the first array should equal the number of rows in the second one, e.g. 2x**3** and **3**x5*
74 | `np.linalg.det(myArray)` *To get the determinant, the array must be square. This is a linear algebra calculation (linalg).*
75 |
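*The shape rule for matrix products and the determinant can be checked with a short sketch like this one. The arrays `a`, `b` and `square` are invented for the example.*

```python
import numpy as np

a = np.arange(6).reshape(2, 3)   # 2x3
b = np.ones((3, 5), dtype=int)   # 3x5

# Inner sizes match (3 and 3), so the product is 2x5.
product = np.matmul(a, b)

# The determinant needs a square array; here 1*4 - 2*3 = -2.
square = np.array([[1, 2], [3, 4]])
det = np.linalg.det(square)

print(product.shape, round(det, 6))
```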
76 | ### 13- Combine Arrays (Shapes must match along the joined side)
77 | `np.vstack((myArray, myArray2))` *Stack two arrays vertically. The rows of the second array come under the first one, so both need the same number of columns.*
78 | `np.hstack((myArray, myArray2))` *Stack two arrays horizontally. The columns of the second array come to the right of the first one, so both need the same number of rows.*
79 |
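*A minimal sketch of both stacking directions, assuming two 2x3 arrays (the names `a` and `b` are hypothetical):*

```python
import numpy as np

a = np.arange(6).reshape(2, 3)       # [[0 1 2], [3 4 5]]
b = np.arange(6, 12).reshape(2, 3)   # [[6 7 8], [9 10 11]]

stacked_v = np.vstack((a, b))   # rows of b go under a: shape (4, 3)
stacked_h = np.hstack((a, b))   # columns of b go right of a: shape (2, 6)

print(stacked_v.shape, stacked_h.shape)
print(stacked_h[0].tolist())
```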
--------------------------------------------------------------------------------
/pandas.md:
--------------------------------------------------------------------------------
1 | ### 1- Install version 1.1.4
2 | `pip install pandas==1.1.4`
3 |
4 | ### 2- Import
5 | `import pandas as pd`
6 |
7 | ### 3- Define data frame (Table with 2 rows and 3 columns)
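8 | `df=pd.DataFrame([[100,101,102],[103,104,105]], index=[1,2])`
9 | *to have an indexed data frame with named columns:*
10 | `df=pd.DataFrame({'id':[100,101,102], 'color':['red','blue','green']}, index=['a','b','c'])`
11 | *to define an empty data frame and append data to it:*
12 | `df=pd.DataFrame(columns=["col1", "col2", ...])`
13 | `df=df.append({"col1":1, "col2":2, ...}, ignore_index=True)` *append returns a new data frame, so assign the result. It was removed in pandas 2.0, where* `pd.concat` *is used instead.*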
14 |
15 | ### 4- Read csv - xlsx file
16 | `df=pd.read_csv('file-name.csv')` *to read just 10 rows:* `df=pd.read_csv('file_name.csv', nrows=10)`
17 | *to convert some values to Null:* `missing_val = ["na", "n/a", "-"]` `df=pd.read_csv('file-name.csv', na_values=missing_val)`
18 | `df=pd.read_excel('file-name.xlsx')` *to read just 10 rows:* `df=pd.read_excel('file_name.xlsx', nrows=10)`
19 | `df=pd.read_pickle('file_name.pkl')`
20 |
21 | ### 5- Save csv - xlsx file
22 | `df.to_csv('file-name.csv', index=False)`
23 | `df.to_excel('file-name.xlsx', index=False)`
24 | `df.to_pickle('file_name.pkl')`
25 |
26 | ### 6- Show the first and last 10 rows (default number is 5)
27 | `df.head()`
28 | `df.head(10)` *or* `df[:10]`
29 | `df.tail()`
30 | `df.tail(10)`
31 |
32 | ### 7- Show the column named "Age" and its 2nd row
33 | `df.Age` *or* `df['Age']`
34 | *indexes start from 0* `df['Age'][1]`
35 |
36 | ### 08- Add a New Column named "id" from the index values
37 | `df['id']=df.index`
38 |
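39 | ### 09- Copy all data of df to temp
40 | `temp=df.copy()` *Note that* `temp=df` *does not copy: it only gives a second name to the same data frame, so changes to one affect the other. To select the columns named A and B:* `temp=df[['A','B']]`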
41 |
42 | ### 10- Shape or Size
43 | `df.shape`
44 |
45 | ### 11- Convert data to True and False based on being Not Null
46 | `df.notnull()`
47 |
48 | ### 12- Show how many Null values exist
49 | `df.isnull().sum()` *to show True and False for being null:* `df.isnull()`
50 |
51 | ### 13- Fill Null Values in column "Age" with 2 (for instance)
52 | `df['Age'].fillna(value=2, inplace=True)` *use inplace to apply the change directly; the alternative to inplace is:* `df['Age']=df['Age'].fillna(value=2)`
53 |
54 | ### 14- Remove Null Values in column "Age"
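55 | *to remove any row whose values are all Null:* `df.dropna(how='all', inplace=True)` *and to remove rows where "Age" is Null:* `df.dropna(subset=['Age'], inplace=True)`
56 | *to remove any row that contains at least one Null value:* `df.dropna(how='any', inplace=True)`
57 | *to keep only rows that contain at least 5 Not Null values:* `df.dropna(thresh=5, inplace=True)` *(add* `axis=1` *to apply the same rule to columns instead of rows)*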
58 |
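*The dropna options are easy to mix up, so here is a small sketch on made-up data. The "City" column and all values are hypothetical.*

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Age": [20, np.nan, np.nan, 40],
    "Name": ["Ali", "Sara", None, None],
    "City": ["A", "B", None, None],
})

no_empty_rows = df.dropna(how="all")       # drops only the fully empty row 2
complete_rows = df.dropna(how="any")       # keeps only row 0
at_least_two = df.dropna(thresh=2)         # keeps rows with 2+ non-null values
age_present = df.dropna(subset=["Age"])    # keeps rows where Age is not null

print(list(no_empty_rows.index), list(complete_rows.index))
print(list(at_least_two.index), list(age_present.index))
```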
59 | ### 15- Remove the Columns named "Age" and "Name" and Remove Rows with index=0 and 2
60 | `df.drop(['Age', 'Name'], axis=1, inplace=True)`
61 | `df.drop([0, 2], axis=0, inplace=True)`
62 |
63 | ### 16- Count how many values equal to 2 are in column "Age"
64 | `df[df.Age==2].count()`
65 | *To count them by multiple conditions:* `df[(df.Age>2) & (df.Age<10)].count()`
66 |
67 | ### 17- Replace A with B in the Column named "Name"
68 | `df.Name.replace('A', 'B', inplace=True)` *or* `df.Name=df.Name.replace('A', 'B')`
69 |
70 | ### 18- Remove "A" from values of the Column named "Name" from the right, left and both sides of the string
71 | *for right:* `df.Name.str.rstrip('A')`
72 | *for left:* `df.Name.str.lstrip('A')`
73 | *for both:* `df.Name.str.strip('A')`
74 |
75 | ### 19- Remove rows that contain a value (strings) or that are greater (or less) than a value (numbers)
76 | `df=df[df.Name.str.contains('A')==False]` *for special signs like \* use \\**
77 | `df=df[df.Age>100]` *keeps only the rows with Age greater than 100*
78 |
79 | ### 20- Change strings to Upper or Lower cases
80 | `df.Name.str.upper()`
81 | `df.Name.str.lower()`
82 |
83 | ### 21- Count group of values
84 | `df.Age.value_counts()`
85 |
86 | ### 22- Rename the Column 'Age' to 'During'
87 | `df.rename(columns={'Age':'During'}, inplace=True)` *to rename all Columns:* `df.columns=['a','b',...]`
88 |
89 | ### 23- Get column names and data types
90 | `df.columns` *to show them in list:* `df.columns.tolist()`
91 | `df.dtypes` *or for "Age" column:* `df.Age.dtypes` *'str' is shown as object*
92 |
93 | ### 24- Convert data type e.g. string dtype of "Age" to float
94 | `df.Age=df.Age.astype(float)` *astype returns a new column, so assign it back*
95 |
96 | ### 25- Categorize column "Age" and get the codes and category names
97 | `df.Age=df.Age.astype('category')`
98 | `df.Age.cat.codes` *-1 is shown for NaN values*
99 | `df.Age.cat.categories`
100 |
101 | ### 26- Sort data by the column named "Age"
102 | `df.sort_values(by="Age", ascending=True, inplace=True)`
103 | *for one column:* `df.Age.sort_values()`
104 |
105 | ### 27- Show the memory usage of each column
106 | `df.memory_usage(deep=True)`
107 |
108 | ### 28- Show general details of columns e.g. count, mean, std, min, etc.
109 | `df.describe()` *for a short summary of column dtypes and non-null counts:* `df.info()`
110 |
111 | ### 29- Show count, mean, std, min, etc. of the data or of each column
112 | `df.max()` *or* `df.Age.max()` *to get the index label of the max (or min) use idxmax (idxmin):* `df.Age.idxmax()` *and for its position:* `df.Age.argmax()`
113 | `df.Age.value_counts(dropna=False)` *if you do not want to count null values, change it to True (the default)*
114 | *to show proportions (normalized) instead of counts, add* `normalize=True`
115 | *list of indexes* `df.Age.value_counts().index`
116 | *list of values* `df.Age.value_counts().values`
117 | *unique values* `df.Age.unique()`
118 |
119 | ### 30- Dummy variables
120 | *to change the unique values of column Age:* `df.Age.map({1:'One', 2:'Two'})` *or, to store the result:* `df.Age=df.Age.map({1:'One', 2:'Two'})`
121 | *to make one indicator column per category:* `pd.get_dummies(df.Age)`
122 |
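*A short sketch of map and get_dummies on invented data. Note that values missing from the mapping become NaN.*

```python
import pandas as pd

df = pd.DataFrame({"Age": [1, 2, 1, 3]})

# map replaces each value via the dict; 3 has no entry, so it becomes NaN.
labels = df.Age.map({1: "One", 2: "Two"})

# get_dummies is a pandas function (pd.), not a DataFrame method.
# It creates one indicator column per unique value: 1, 2 and 3.
dummies = pd.get_dummies(df.Age)

print(labels.tolist())
print(list(dummies.columns), dummies.shape)
```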
123 | ### 31- Choose the column Age as the index of the data frame
124 | `df.set_index("Age", inplace=True)`
125 | *to assign a name to the index column or remove it:* `df.index.name="AgeIndex"` `df.index.name=None`
126 | *to reset index values (0, 1, ...):* `df.reset_index(inplace=True)`
127 |
128 | ### 32- Access rows by their index
129 | `df.iloc[[10, 30]]` *selects rows by position; a position returned by e.g.* `df.Age.argmax()` *can be used instead of the list*
130 | *to get the value of specific columns:*
131 | `df.iloc[[10]]["Age"]` *or* `df.iloc[[10]].Age` *or, by index label instead of position,* `df.loc[10, "Age"]` *or* `df.at[10, "Age"]`
132 |
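*Position and label only differ when the index is not 0, 1, 2, ... in order. A minimal sketch with a shuffled index (the data is invented):*

```python
import pandas as pd

df = pd.DataFrame({"Age": [30, 10, 20]}, index=[2, 0, 1])

by_position = df.iloc[0]["Age"]   # first row in the table
by_label = df.loc[0, "Age"]       # row whose index label is 0
fast_label = df.at[0, "Age"]      # same as loc, for a single value

print(by_position, by_label, fast_label)
```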
133 | ### 33- Filter data frame by the value of the column named Age
134 | `df[df.Age>10]` *to filter by more than one column:* `df[(df.Age>10) & (df.Name.str.contains('A'))]`
135 | *to show just True or False for each row:* `df.Age>10`
136 | *to check whether they exist in a list or not:* `df.Age.isin([10,20,30])`
137 |
138 | ### 34- Group data by a column value
139 | `df.groupby("Age")` *to count, mean, min, max, etc:* `df.groupby("Age").count()` *or* `df.groupby("Age").Name.sum()`
140 | *to get more details per group:* `df.groupby("Age").Age.agg(["count", "sum", "min", "max"])`
141 |
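*A small groupby sketch on invented data. The "Score" column is hypothetical and only there to have something to aggregate.*

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Ali", "Sara", "Reza", "Mina"],
    "Age": [20, 20, 30, 30],
    "Score": [1, 2, 3, 4],
})

# One count per Age group.
counts = df.groupby("Age").Name.count()

# Several aggregations at once give one column per function.
summary = df.groupby("Age").Score.agg(["count", "sum", "min", "max"])

print(counts.to_dict())
print(summary)
```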
142 | ### 35- DataFrame Iteration
143 | ```
144 | for index, row in df.iterrows():
145 |     pass  # codes...
146 | ```
147 |
148 | ### 36- Show some rows randomly
149 | `df.sample(n=3, random_state=65)` *use random_state if you want the same results in different executions*
150 | *to select 5 percent of whole data frame:* `df.sample(frac=0.05)`
151 | *if you want to shuffle whole data:* `df.sample(frac=1)`
152 |
153 | ### 37- Show Duplicated values
154 | `df.duplicated()` *or for checking the column named Age:* `df.Age.duplicated()`
155 | *to get number of duplicated values:* `df.Age.duplicated().sum()`
156 | *to remove duplicated values:* `df.drop_duplicates(inplace=True)`
157 |
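*A sketch of the duplicate checks on invented data. A row only counts as duplicated when all of its columns repeat an earlier row.*

```python
import pandas as pd

df = pd.DataFrame({
    "Age": [20, 20, 30, 20],
    "Name": ["Ali", "Ali", "Sara", "Reza"],
})

row_dups = df.duplicated()        # compares whole rows
age_dups = df.Age.duplicated()    # compares the Age column only
deduped = df.drop_duplicates()    # returns a copy without repeated rows

print(row_dups.tolist(), age_dups.tolist(), len(deduped))
```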
158 | ### 38- Functions (keyword: apply)
159 | *to get the length of each string in a column:* `df['name_lenghth']=df.Name.apply(len)`
160 | *to add 2 to Ages:* `df.Age+=2` *to do it with a function:*
161 | ```
162 | def get_element(num):
163 |     return num+2
164 |
165 | df['temp_age']=df.Age.apply(get_element)
166 | ```
167 | *to change values greater than 8 to 1 and all others to 0:*
168 | ```
169 | import numpy as np
170 |
171 | # df must already contain an 'Age' column
172 | df['temp_age']=np.where(df.Age>8 ,1 ,0)
173 | ```
174 | *Choose 10 random rows with Age greater than 50 and 10 random rows with Name equal to 'Ali', then combine and shuffle them:*
175 | ```
176 | df1=df[df.Age>50].sample(n=10)
177 | df2=df[df.Name=='Ali'].sample(n=10)
178 |
179 | df_all=pd.concat([df1,df2]).sample(frac=1)
180 | ```
180 | ```
181 |
182 |
--------------------------------------------------------------------------------
/scikit-learn.md:
--------------------------------------------------------------------------------
1 | ### 1- Install
2 | `pip install scikit-learn`
--------------------------------------------------------------------------------