├── README.md
├── numpy.md
├── pandas.md
└── scikit-learn.md


/README.md:
--------------------------------------------------------------------------------
 1 | # Code Snippet
 2 | 
 3 | ## Pandas
 4 | https://github.com/dmahdipour/code-snippet/blob/main/pandas.md
 5 | 
 6 | ## Numpy
 7 | https://github.com/dmahdipour/code-snippet/blob/main/numpy.md
 8 | 
 9 | ## Scikit-learn
10 | https://github.com/dmahdipour/code-snippet/blob/main/scikit-learn.md
11 | 


--------------------------------------------------------------------------------
/numpy.md:
--------------------------------------------------------------------------------
 1 | ### 1- Install 
 2 | `pip install numpy`
 3 | 
 4 | ### 2- Import
 5 | `import numpy as np`
 6 | 
 7 | ### 3- Define an array (Matrix with 2 rows and 3 columns)
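 8 | `np.array([[1, 2, 3],[4, 5, 6]])` *If the rows have different lengths, NumPy cannot build a numeric matrix: older versions return an array of lists instead of numbers, and NumPy 1.24+ raises an error unless you pass* `dtype=object`*:* `np.array([[1, 2],[4, 5, 6]], dtype=object)`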
 9 | 
10 | ### 4- Define an array (Ordinary Numbers)
11 | `np.arange(2, 9, 0.1)` *Start from 2, stop before 9, step 0.1* <br>
12 | *Use an integer step to get integer values:* `np.arange(2, 9, 3)` <br>
13 | `np.ones((3,4), dtype=int)` *All items are 1; for a 1D array:* `np.ones(3)` <ins>default dtype is float</ins> <br>
14 | *To fill with another number (e.g. 5) instead of 1:* `np.full((3,4), 5)` <br>
15 | `np.zeros((3,4), dtype=int)` *All items are 0; for a 1D array:* `np.zeros(3)` <br>
16 | `np.eye(3)` *Create an identity matrix: 1 on the main diagonal and 0 everywhere else* <br>
17 | `np.linspace(1,4, num=10)` *Create a 10-item array that starts at 1 and ends at 4 (both included)* <br>
18 | `np.random.random((3,4))` *Create a 3x4 array of random floats between 0 and 1. To get 9 random integers between 0 and 9 as a 3x3 array:* `np.random.randint(10, size=9).reshape(3,3)`
19 | 
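The creation helpers above differ mainly in how they treat the end value and the shape argument. A minimal sketch (the variable names are only for illustration):

```python
import numpy as np

# arange excludes the stop value; linspace includes it by default.
steps = np.arange(2, 9, 3)            # 2, 5, 8
grid = np.linspace(1, 4, num=10)      # 10 evenly spaced values from 1 to 4

# The tuple passed to random/ones/zeros/full is the shape, not a value range.
rand_floats = np.random.random((3, 4))                   # 3x4 floats in [0, 1)
rand_ints = np.random.randint(10, size=9).reshape(3, 3)  # 3x3 ints in [0, 10)

print(steps, grid[0], grid[-1])
print(rand_floats.shape, rand_ints.shape)
print(np.ones(3).dtype, np.full((3, 4), 5).dtype)
```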
20 | ### 5- Change the Shape of Arrays
21 | `myArray=np.arange(12).reshape(3, 4)` *To turn the 2D array back into 1D:* `myArray.reshape(12)` *to make it 1 row with 12 columns:* `myArray.reshape(1,12)` <br>
22 | `myArray.flatten()` *Convert any array to 1D (returns a copy)*
23 | 
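A short sketch of how the reshaping calls differ in the shape they return, and of `flatten` returning an independent copy:

```python
import numpy as np

myArray = np.arange(12).reshape(3, 4)    # 3 rows, 4 columns

flat = myArray.reshape(12)               # 1D, shape (12,)
one_row = myArray.reshape(1, 12)         # 2D with a single row, shape (1, 12)
copy_1d = myArray.flatten()              # new 1D copy of the data

copy_1d[0] = 99                          # changing the copy leaves myArray untouched
print(flat.shape, one_row.shape, myArray[0, 0])
```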
24 | ### 6- Update Array's Values
25 | `myArray[2]=30` *updates one index (a whole row in a 2D array). To update a range of indexes:* `myArray[1:4]=10` *and for a range of columns in one row of a 2D array:* `myArray[1, 0:3]=10`
26 | 
27 | ### 7- Get Array's Dimension, Length and Size:
28 | `myArray.ndim` *1 for 1D, 2 for 2D, etc.* <br>
29 | `myArray.shape` *e.g. (3, 4) - to get the number of rows:* `myArray.shape[0]` *or* `len(myArray)` *and to get the number of columns:* `myArray.shape[1]` <br>
30 | `myArray.size` *Get the total number of items, e.g. 12*
31 | 
32 | ### 8- Convert List to Array
33 | ```
34 | myList=[[1,2,3],[4,5,6]]
35 | myArray=np.asarray(myList)
36 | ```
37 | 
38 | ### 9- Built-in Functions
39 | <ins>*In a 2D array, add axis=0 to get one result per column (computed down the rows) and axis=1 to get one result per row; without axis, the whole array is used*</ins> <br>
40 | `np.min(myArray, axis=0)` `np.max(myArray)` *Min and Max* <br>
41 | `np.argmin(myArray)` `np.argmax(myArray, axis=1)` *Indexes of Min and Max items* <br>
42 | `np.mean(myArray, axis=1)` *Average of the items* <br>
43 | `np.median(myArray, axis=0)` *Middle value of the sorted items* <br>
44 | `np.std(myArray, axis=0)` *Standard Deviation of the items* <br>
45 | `np.sum(myArray, axis=1)` *Sum of the items by columns or rows*
46 | 
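The effect of `axis` is easiest to see on a small array. A sketch with a 2x3 array chosen only for illustration:

```python
import numpy as np

myArray = np.array([[1, 2, 3],
                    [4, 5, 6]])

col_min = np.min(myArray, axis=0)         # one value per column
row_sum = np.sum(myArray, axis=1)         # one value per row
row_argmax = np.argmax(myArray, axis=1)   # column position of the max in each row
flat_argmax = np.argmax(myArray)          # position in the flattened array

print(col_min, row_sum, row_argmax, flat_argmax)
```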
47 | ### 10- Indexing and Accessing Items
48 | `myArray[1]` *or for 2D:* `myArray[1,2]` *The first index is for the row and the second one for the column* <br>
49 | `myArray[0:2,:]` *Get rows 0 and 1 (the end index is excluded) with all columns* <br>
50 | `myArray[:,1:]` *Get all rows, but only columns from 1 onward* <br>
51 | `myArray[-3:]` *Get just the last 3 items* <br>
52 | `myArray[:-2]` *Get all items except the last 2*
53 | 
54 | ### 11- Loops
55 | ```
56 | for row in myArray:
57 |   print(row+3)
58 | 
59 | 
60 | for row in myArray:
61 |   for item in row:
62 |     print(item*3)
63 | ```
64 | 
65 | ### 12- Mathematics and Logic
66 | `myArray.T` *Transpose* <br>
67 | `myArray+10` *Add 10 to each item. Subtraction, multiplication, division, power, etc. can also be used.* <br>
68 | `myArray + myArray2` *Add the items of 2 arrays one by one* <br>
69 | `myArray>2` *Get True for each item that meets the condition and False otherwise. To get the items instead of True/False:* `myArray[myArray>2]` <br>
70 | `np.sqrt(myArray)` *Ceil, Sin, Cos, Tan, Arcsin, Arccos, Arctan, etc. can also be used.* <br> 
71 | *for more:* https://numpy.org/doc/stable/reference/routines.math.html <br><br>
72 | `myArray.dot(myArray2)` *or* `np.dot(myArray, myArray2)` *or* `np.matmul(myArray, myArray2)` <br>
73 | <ins>*The number of columns in the first array should equal the number of rows in the second one. e.g. 2x**3**  and  **3**x5*</ins> <br><br>
74 | `np.linalg.det(myArray)` *To get the determinant, the array should be square. This is a linear algebra calculation (linalg).*
75 | 
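A minimal sketch of boolean filtering, the shape rule for matrix products, and the determinant. The arrays `a`, `b` and `square` are made-up examples:

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])             # shape 2x3
b = np.arange(15).reshape(3, 5)       # shape 3x5

selected = a[a > 2]                   # keep only the items greater than 2
product = np.matmul(a, b)             # 2x3 times 3x5 gives 2x5

square = np.array([[1.0, 2.0],
                   [3.0, 4.0]])
det = np.linalg.det(square)           # 1*4 - 2*3, up to floating-point rounding

print(selected, product.shape, round(det, 6))
```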
76 | ### 13- Combine Arrays (the arrays should have matching dimensions)
77 | `np.vstack((myArray, myArray2))` *Stack two arrays vertically. Rows of the second array come under the first one, so both need the same number of columns.* <br>
78 | `np.hstack((myArray, myArray2))` *Stack two arrays horizontally. Columns of the second array come to the right of the first one, so both need the same number of rows.* <br>
79 | 
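The two stacking directions can be compared on a pair of small 2x2 arrays (values made up for illustration):

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])
b = np.array([[5, 6],
              [7, 8]])

v = np.vstack((a, b))   # rows of b go under a: shape (4, 2)
h = np.hstack((a, b))   # columns of b go to the right of a: shape (2, 4)

print(v.shape, h.shape)
```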


--------------------------------------------------------------------------------
/pandas.md:
--------------------------------------------------------------------------------
  1 | ### 1- Install version 1.1.4
  2 | `pip install pandas==1.1.4`
  3 | 
  4 | ### 2- Import
  5 | `import pandas as pd`
  6 | 
  7 | ### 3- Define data frame (Table with 2 rows and 3 columns)
  8 | `df=pd.DataFrame([[100,101,102], [103,104,105]], index=[1,2])` <br>
  9 | *to have a data frame with named columns and a custom index:* <br>
 10 | `df=pd.DataFrame({'id':[100,101,102], 'color':['red','blue','green']}, index=['a','b','c'])` <br>
 11 | *to define an empty data frame and append data to it:* <br>
 12 | `df=pd.DataFrame(columns=["col1", "col2", ...])` <br>
 13 | `df=df.append({"col1":1, "col2":2, ...}, ignore_index=True)` *append returns a new data frame; it was removed in pandas 2.0, where* `pd.concat` *is used instead*
 14 | 
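Since `append` only exists in older pandas versions, here is a sketch of adding a row with `pd.concat`, which works in both old and new versions. The extra row (`103`, `'black'`, index `'d'`) is made up for illustration:

```python
import pandas as pd

df = pd.DataFrame({'id': [100, 101, 102],
                   'color': ['red', 'blue', 'green']},
                  index=['a', 'b', 'c'])

# Build the new row as its own one-row data frame, then concatenate.
new_row = pd.DataFrame({'id': [103], 'color': ['black']}, index=['d'])
df = pd.concat([df, new_row])

print(df.shape)
```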
 15 | ### 4- Read csv - xlsx file
 16 | `df=pd.read_csv('file-name.csv')` *to read just 10 rows:* `df=pd.read_csv('file_name.csv', nrows=10)` <br>
 17 | *to convert some values to Null:* `missing_val = ["na", "n/a", "-"]` `df=pd.read_csv('file-name.csv', na_values=missing_val)` <br>
 18 | `df=pd.read_excel('file-name.xlsx')` *to read just 10 rows:* `df=pd.read_excel('file_name.xlsx', nrows=10)` <br>
 19 | `df=pd.read_pickle('file_name.pkl')`
 20 | 
 21 | ### 5- Save csv - xlsx file
 22 | `df.to_csv('file-name.csv', index=False)` <br>
 23 | `df.to_excel('file-name.xlsx', index=False)` <br>
 24 | `df.to_pickle('file_name.pkl')`
 25 | 
 26 | ### 6- Depict top and last 10 rows (default number is 5)
 27 | `df.head()` <br>
 28 | `df.head(10)` *or* `df[:10]` <br>
 29 | `df.tail()` <br>
 30 | `df.tail(10)`
 31 | 
 32 | ### 7- Depict column namely "Age" and 2nd row of it
 33 | `df.Age` *or* `df['Age']` <br>
 34 | *indexes start from 0 (with the default index):* `df['Age'][1]`
 35 | 
 36 | ### 8- Add New Column namely "id" from index values
 37 | `df['id']=df.index`
 38 | 
 39 | ### 9- Copy whole data of df to temp
 40 | `temp=df.copy()` *note that* `temp=df` *does not copy: both names point to the same data. For selecting columns namely A and B:* `temp=df[['A','B']]`
 41 | 
 42 | ### 10- Shape or Size
 43 | `df.shape`
 44 | 
 45 | ### 11- Convert data to True and False based on being Not Null
 46 | `df.notnull()`
 47 | 
 48 | ### 12- Depict how many Null values exist
 49 | `df.isnull().sum()` *to depict True and False for being null:* `df.isnull()`
 50 | 
 51 | ### 13- Fill Null Values in column "Age" by 2 (for instance)
 52 | `df['Age'].fillna(value=2, inplace=True)` *use inplace to apply the change to df - the alternative to inplace is:* `df['Age']=df['Age'].fillna(value=2)`
 53 | 
 54 | ### 14- Remove Null Values in column "Age"
 55 | *to remove any row where "Age" is Null:* `df.dropna(subset=['Age'], inplace=True)` *and any row where all values are Null:* `df.dropna(how='all', inplace=True)` <br>
 56 | *to remove any row that contains even one Null value:* `df.dropna(how='any', inplace=True)` <br>
 57 | *to keep only columns with at least 5 Not Null values (use axis=0 for rows):* `df.dropna(thresh=5, axis=1, inplace=True)` 
 58 | 
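The `dropna` and `fillna` variants above can be compared on a small frame. This is a sketch with made-up sample data; it returns new objects instead of using `inplace`:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Age': [25, np.nan, np.nan, 40],
                   'Name': ['Ali', 'Sara', None, None]})

no_empty_rows = df.dropna(how='all')     # drops only row 2 (every value is Null)
age_known = df.dropna(subset=['Age'])    # drops rows where Age is Null
complete = df.dropna(how='any')          # drops rows with at least one Null
filled = df['Age'].fillna(value=2)       # replaces Null ages with 2

print(list(no_empty_rows.index), list(age_known.index), list(complete.index))
```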
 59 | ### 15- Remove Columns namely "Age" and "Name" and Remove Rows with index=0 and 2
 60 | `df.drop(['Age', 'Name'], axis=1, inplace=True)` <br>
 61 | `df.drop([0, 2], axis=0, inplace=True)`
 62 | 
 63 | ### 16- Count how many 2 values are in column "Age"
 64 | `df[df.Age==2].count()` <br>
 65 | *To count them by multiple conditions:* `df[(df.Age>2) & (df.Age<10)].count()`
 66 | 
 67 | ### 17- Replace A with another B in Column namely "Name"
 68 | `df.Name.replace('A', 'B', inplace=True)` *or* `df.Name=df.Name.replace('A', 'B')`
 69 | 
 70 | ### 18- Remove "A" from values of Column namely "Name" from right, left and both side of string
 71 | *for right:* `df.Name.str.rstrip('A')` <br>
 72 | *for left:* `df.Name.str.lstrip('A')` <br>
 73 | *for both:* `df.Name.str.strip('A')`
 74 | 
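 75 | ### 19- Remove rows that contain a value (strings) or are greater (or less) than a value (numbers)
 76 | `df=df[df.Name.str.contains('A')==False]` *for special regex signs like* `*` *use* `\*` <br>
 77 | `df=df[df.Age>100]` *keeps the rows with Age greater than 100 and removes the rest; flip the condition to remove the other side*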
 78 | 
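A sketch of string and numeric filtering on made-up data. Note that `contains` lives under the `.str` accessor and treats its pattern as a regular expression unless `regex=False` is passed:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Ali', 'Bob', 'A*', 'Tom'],
                   'Age': [30, 150, 45, 101]})

without_a = df[~df.Name.str.contains('A')]                  # rows whose Name has no 'A'
with_star = df[df.Name.str.contains(r'\*')]                 # escaped regex special sign
with_star_plain = df[df.Name.str.contains('*', regex=False)]  # same, as a literal match
up_to_100 = df[df.Age <= 100]                               # removes rows with Age over 100

print(without_a.Name.tolist(), with_star.Name.tolist(), up_to_100.Name.tolist())
```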
 79 | ### 20- Change strings to Upper or Lower cases
 80 | `df.Name.str.upper()` <br>
 81 | `df.Name.str.lower()`
 82 | 
 83 | ### 21- Count group of values
 84 | `df.Age.value_counts()`
 85 | 
 86 | ### 22- Rename Column namely 'Age' to 'During'
 87 | `df.rename(columns={'Age':'During'}, inplace=True)` *to rename whole Columns:* `df.columns=['a','b',...]`
 88 | 
 89 | ### 23- Get columns name and data type
 90 | `df.columns` *to show them in list:* `df.columns.tolist()` <br>
 91 | `df.dtypes` *or for "Age" column:* `df.Age.dtypes` *'str' is shown as object*
 92 | 
 93 | ### 24- Convert data type e.g. string dtype of "Age" to float
 94 | `df.Age=df.Age.astype(float)`
 95 | 
 96 | ### 25- Categorize column "Age" and get codes and categories' name
 97 | `df.Age=df.Age.astype('category')` <br>
 98 | `df.Age.cat.codes` *-1 is shown for NaN values*<br>
 99 | `df.Age.cat.categories`
100 | 
101 | ### 26- Sort data by column namely "Age"
102 | `df.sort_values(by="Age", ascending=True, inplace=True)` <br>
103 | *for one column:* `df.Age.sort_values()`
104 | 
105 | ### 27- Depict memory usage of each columns
106 | `df.memory_usage(deep=True)`
107 | 
108 | ### 28- Depict a general details of columns e.g. count, mean, std, min, etc.
109 | `df.describe()` *for a shorter summary (column dtypes and non-null counts):* `df.info()`
110 | 
111 | ### 29- Depict count, mean, std, min, etc of data or each columns
112 | `df.max()` *or* `df.Age.max()` *to get the index of the max or min, use idxmax or idxmin:* `df.Age.idxmax()` <br>
113 | `df.Age.value_counts(dropna=False)` *counts null values too; if you do not want to count them, change it to True (default is True)* <br>
114 | *to depict in percentage mode (normalized), add* `normalize=True` <br>
115 | *list of indexes* `df.Age.value_counts().index` <br>
116 | *list of values* `df.Age.value_counts().values` <br>
117 | *unique values* `df.Age.unique()`
118 | 
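A minimal sketch of how `dropna` and `normalize` change the result of `value_counts`, using a made-up "Age" column with one Null value:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Age': [10, 20, 20, np.nan, 30, 20]})

counts = df.Age.value_counts()                     # Null is skipped by default
counts_with_null = df.Age.value_counts(dropna=False)
shares = df.Age.value_counts(normalize=True)       # fractions of the non-null rows
max_index = df.Age.idxmax()                        # index label of the largest Age

print(counts.loc[20.0], counts_with_null.sum(), shares.loc[20.0], max_index)
```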
119 | ### 30- Dummy variables
120 | *to change unique values of column Age:* `df.Age.map({1:'One', 2:'Two'})` *or, to save the result:* `df.Age=df.Age.map({1:'One', 2:'Two'})` *(values missing from the dict become NaN)* <br>
121 | *to make one indicator column per category:* `pd.get_dummies(df.Age)`
122 | 
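A sketch of mapping values and building indicator columns on a made-up "Age" column. The value `3` is left out of the mapping on purpose to show that `map` turns it into NaN:

```python
import pandas as pd

df = pd.DataFrame({'Age': [1, 2, 1, 3]})

mapped = df.Age.map({1: 'One', 2: 'Two'})   # 3 is not in the dict, so it becomes NaN
dummies = pd.get_dummies(df.Age)            # one indicator column per unique value

print(mapped.tolist())
print(dummies.shape, list(dummies.columns))
```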
123 | ### 31- Choose the column Age as index of data frame
124 | `df.set_index("Age", inplace=True)` <br>
125 | *to assign a name to the index column or remove it:* `df.index.name="AgeIndex"` `df.index.name=None` <br>
126 | *to reset index values (0~ ...):* `df.reset_index(inplace=True)`
127 | 
128 | ### 32- Access to rows by index of them
129 | `df.iloc[[10, 30]]` *iloc selects by position; a position computed by argmax, argmin, etc. can be used instead of the list* <br>
130 | *to get value of special columns:* <br> 
131 | `df.iloc[[10]]["Age"]` *or* `df.iloc[[10]].Age` *or, by index label instead of position:* `df.loc[10, "Age"]` *or* `df.at[10, "Age"]`
132 | 
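Position-based and label-based access only give the same answer when the index happens to be 0, 1, 2, and so on. A sketch with a made-up index of 10, 20, 30 to show the difference:

```python
import pandas as pd

df = pd.DataFrame({'Age': [30, 40, 50]}, index=[10, 20, 30])

by_position = df.iloc[0]['Age']     # first row, whatever its label is
by_label = df.loc[10, 'Age']        # row whose index label is 10
single_cell = df.at[20, 'Age']      # fast access to one cell by label

# A computed position can replace a hard-coded one.
oldest = df.iloc[df.Age.to_numpy().argmax()]

print(by_position, by_label, single_cell, oldest['Age'])
```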
133 | ### 33- Filter data frame by column namely Age value
134 | `df[df.Age>10]` *to filter by more than one column:* `df[(df.Age>10) & (df.Name.str.contains('A'))]` <br>
135 | *to depict just True or False for all of data:* `df.Age>10` <br>
136 | *to check whether they exist in a list or not:* `df.Age.isin([10,20,30])`
137 | 
138 | ### 34- Group data by a column value
139 | `df.groupby("Age")` *to count, mean, min, max, etc:* `df.groupby("Age").count()` *or* `df.groupby("Age").Name.sum()` <br>
140 | *to get more details based by groups:* `df.groupby("Age").Age.agg(["count", "sum", "min", "max"])`
141 | 
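A sketch of grouping on made-up data. Here the rows are grouped by "Name" and the "Age" column is aggregated, which is an assumption chosen to make the numbers easy to follow:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Ali', 'Ali', 'Sara'],
                   'Age': [20, 30, 25]})

counts = df.groupby('Name').count()                                   # rows per group
stats = df.groupby('Name').Age.agg(['count', 'sum', 'min', 'max'])    # several results at once

print(stats)
```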
142 | ### 35- DataFrame Iteration
143 | ```
144 | for index, row in df.iterrows():
145 |   print(index, row)  # put your own code here
146 | ```
147 | 
148 | ### 36- Depict some rows randomly
149 | `df.sample(n=3, random_state=65)` *use random_state to get the same result in different executions* <br>
150 | *to select 5 percent of whole data frame:* `df.sample(frac=0.05)` <br>
151 | *if you want to shuffle whole data:* `df.sample(frac=1)`
152 | 
153 | ### 37- Depict Duplicated values
154 | `df.duplicated()` *or for checking column namely Age:* `df.Age.duplicated()` <br>
155 | *to get number of duplicated values:* `df.Age.duplicated().sum()` <br>
156 | *to remove duplicated values:* `df.drop_duplicates(inplace=True)`
157 | 
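A sketch of the duplicate checks on made-up data. A row counts as duplicated only from its second occurrence onward:

```python
import pandas as pd

df = pd.DataFrame({'Age': [20, 20, 30, 20],
                   'Name': ['Ali', 'Ali', 'Sara', 'Reza']})

row_flags = df.duplicated()               # True where the whole row repeats an earlier one
age_repeats = df.Age.duplicated().sum()   # repeated values in one column
deduped = df.drop_duplicates()            # new frame without the repeated rows

print(row_flags.tolist(), age_repeats, deduped.shape)
```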
158 | ### 38- Functions (keyword: apply)
159 | *to get the length of each value in a string column:* `df['name_length']=df.Name.apply(len)` <br>
160 | *to add 2 to Ages:* `df.Age+=2` *to do it with a function:* <br>
161 | ```
162 | def get_element(num):
163 |   return num+2
164 |   
165 | df['temp_age']=df.Age.apply(get_element)
166 | ```
167 | *to change values greater than 8 to 1 and the rest to 0:*
168 | ```
169 | import numpy as np
170 | 
171 | # df is an existing data frame with an "Age" column
172 | df['temp_age']=np.where(df.Age>8 ,1 ,0)
173 | ```
174 | *Choose 10 random rows with Age greater than 50 and 10 random rows with Name equal to 'Ali', then combine and shuffle them:*
175 | ```
176 | df1=df[df.Age>50].sample(n=10)
177 | df2=df[df.Name=='Ali'].sample(n=10)
178 | 
179 | df_all=pd.concat([df1,df2]).sample(frac=1)
180 | ```
181 | 
182 | 


--------------------------------------------------------------------------------
/scikit-learn.md:
--------------------------------------------------------------------------------
1 | ### 1- Install 
2 | `pip install scikit-learn`


--------------------------------------------------------------------------------