├── 01 Day Getting Started With R └── README.md ├── 02 Day R Comments └── README.md ├── 03 Day R Variables and Constants ├── README.md └── Variable.r ├── 04 Day R Data Types ├── CharacterDataType.r ├── ComplexDataTypes.r ├── DateTimeDataTypes.r ├── FactorDataTypes.r ├── IntegerDataType.r ├── LogicalDataTypes.r ├── NumericDataTypes.r └── README.md ├── 05 Day R Print Output & User Input ├── PrintOutputExample.r └── README.md ├── 06 Day R Operators ├── ArithmeticOperators.r ├── AssignmentOperators.r ├── ComparisonOperators.r ├── LogicalOperators.r ├── MiscellaneousOperators.r └── README.md ├── 07 Day R if...else ├── Checking_Positive_Negative_Number.r ├── Example1Simple.r ├── Example2Nested.r ├── GradeClassification.r └── README.md ├── 08 Day R ifelse() Function ├── Example1.r ├── Example2.r ├── Example3.r ├── Example4.r └── README.md ├── 09 Day R while Loop ├── Example1.r ├── Example2.r ├── Example3.r ├── Example4.r └── README.md ├── 10 Day R for Loop ├── Example1.r ├── Example2.r ├── Example3.r ├── Example4.r └── README.md ├── 11 Day R repeat Loop ├── README.md ├── ReversingString.r └── UserInputValidation.r ├── 12 Day R break and next ├── Example1.r ├── Example2.r ├── Example3.r └── README.md ├── 13 Day R Functions ├── Closure Function.r ├── Generator Function.r ├── Mapping and Reducing Functions.r ├── README.md └── Recursive Function.r ├── 15 Day R Strings └── README.md ├── 16 Day R Vectors ├── README.md └── Vectors.r ├── 17 Day R Matrix ├── Matrix.r ├── Question.md └── README.md ├── 18 Day R List ├── List Methods.md ├── README.md └── list.r ├── 19 Day R Array ├── README.md └── array.r ├── 20 Day R Data Frame ├── Advanced-level practice questions.md ├── Practice_Questions.r ├── README.md └── dataframe.r ├── 21 Day R Factors ├── README.md └── factors.r ├── 22 Day R Data Visualization ├── R Bar Plot │ ├── Example.md │ ├── R bar Plot.md │ └── README.md ├── R Boxplot │ ├── Example.md │ └── README.md ├── R Histogram │ ├── Example.md │ └── README.md └── R Pie Chart │ 
├── Advanced customizations and explanations.md │ ├── Example01.md │ └── README.md ├── Case Study ├── 01.md ├── Data Frame Practice Questions.md ├── DataSet.md ├── Predicting EMI Loan Default.md ├── Predicting EMI Loan Default_Question.md ├── Predicting Employee Attrition.md └── mtcars_Case Study Question.md ├── README.md └── R_Programming_Example.ipynb /01 Day Getting Started With R/README.md: -------------------------------------------------------------------------------- 1 | # Write and execute R code in a Jupyter Notebook. 2 | 3 | To write and execute R code in a Jupyter Notebook, you'll need to follow these steps: 4 | 5 | 1. **Install R and Jupyter**: Ensure you have R and Jupyter Notebook installed on your system. If you haven't already, you can install them using a package manager like `conda` or `pip`: 6 | 7 | ``` 8 | # Install Jupyter Notebook using pip 9 | pip install notebook 10 | 11 | # Install R 12 | # On Ubuntu or Debian-based systems 13 | sudo apt-get install r-base 14 | ``` 15 | 16 | 2. **Install R Kernel for Jupyter**: To use R in Jupyter Notebook, you'll need to install the R kernel. You can do this using the `IRkernel` package in R: 17 | 18 | Open an R session in your terminal or R console and run: 19 | 20 | ```R 21 | install.packages("IRkernel") 22 | IRkernel::installspec() 23 | ``` 24 | 25 | 3. **Launch Jupyter Notebook**: Start Jupyter Notebook by running the following command in your terminal: 26 | 27 | ``` 28 | jupyter notebook 29 | ``` 30 | 31 | This will open a new browser window/tab with the Jupyter Notebook interface. 32 | 33 | 4. **Create a New Notebook**: In the Jupyter Notebook interface, click the "New" button and select "R" from the dropdown menu. This will create a new notebook with an R kernel. 34 | 35 | 5. **Write and Execute R Code**: In the notebook, you'll see cells where you can write and execute R code. You can change the cell type to "Code" if it's not already by selecting "Code" from the dropdown menu in the toolbar. 
Then, you can write your R code in the cell and execute it by pressing Shift+Enter or clicking the "Run" button in the toolbar. 36 | 37 | For example: 38 | 39 | ```R 40 | # This is an R code cell 41 | x <- c(1, 2, 3, 4, 5) 42 | mean_x <- mean(x) 43 | mean_x 44 | ``` 45 | 46 | After running the cell, the output will be displayed below it. 47 | 48 | 6. **Markdown Cells**: You can also insert Markdown cells to add explanations, documentation, or formatted text to your notebook. Select "Markdown" from the dropdown menu and write Markdown content in these cells. 49 | 50 | 7. **Save Your Notebook**: Don't forget to save your work regularly by clicking the save icon in the toolbar or pressing Ctrl+S (Cmd+S on macOS). 51 | 52 | 8. **Close and Shut Down**: When you're done, close the Jupyter Notebook tab in your browser. To shut down the notebook server, go to the terminal where Jupyter Notebook is running and press Ctrl+C. Confirm the shutdown if prompted. 53 | 54 | That's it! You can now write and execute R code in Jupyter Notebook using the R kernel. Jupyter Notebook is a versatile environment that allows you to mix code, documentation, and visualizations in a single document, making it a powerful tool for data analysis and exploration. 55 | -------------------------------------------------------------------------------- /02 Day R Comments/README.md: -------------------------------------------------------------------------------- 1 | # R Comments 2 | 3 | In R, comments are used to add explanatory or descriptive notes within your code. Comments are ignored by the R interpreter and are not executed as part of the program. They are solely for human readers, including yourself and other developers, to understand the code. Here's how you can add comments in R: 4 | 5 | 1. **Single-line Comments**: To add a single-line comment in R, you can use the `#` symbol. Anything following the `#` on the same line is treated as a comment and is not executed. 
6 | 7 | ```R 8 | # This is a single-line comment 9 | x <- 10 # A comment can also follow code on the same line 10 | ``` 11 | 12 | 2. **Multi-line Comments**: R does not have a built-in syntax for multi-line comments like some other programming languages do. However, you can achieve multi-line comments by using the `#` symbol at the beginning of each line. 13 | 14 | ```R 15 | # This is a multi-line comment 16 | # It spans multiple lines by using a '#' at the beginning of each line. 17 | ``` 18 | 19 | Wrapping text in `'''` (triple single-quotes) or `"""` (triple double-quotes), as in Python, does not work in R: the parser reads it as several adjacent string literals and reports a syntax error. If you need R to skip a block without prefixing every line, wrap it in `if (FALSE) { ... }`; the contents are never evaluated, but they must still parse as valid R. 20 | 21 | ```R 22 | if (FALSE) { 23 | "This text is never evaluated, 24 | but it must still be a valid R string." 25 | } 26 | ``` 27 | 28 | In RStudio, you can also select several lines and toggle `#` on all of them with Ctrl+Shift+C (Cmd+Shift+C on macOS): 29 | 30 | ```R 31 | # Each of these lines was commented 32 | # with a single keystroke instead of 33 | # typing '#' by hand. 34 | # print("not executed") 35 | ``` 36 | 37 | 3. **Commenting Out Code**: Comments are often used to temporarily disable or "comment out" a section of code for debugging or testing purposes. You can comment out multiple lines of code by adding `#` to the beginning of each line. 38 | 39 | ```R 40 | # This code is temporarily disabled 41 | # x <- 5 42 | # y <- 10 43 | ``` 44 | 45 | 4. **Inline Comments**: You can also add comments inline with code to provide explanations for specific statements. Inline comments should be placed after the code on the same line. 46 | 47 | ```R 48 | result <- x + y # Calculate the sum of x and y 49 | ``` 50 | 51 | 5. **Documenting Functions**: When you write functions in R, it's a good practice to use comments to provide documentation for the function. You can describe what the function does, its parameters, and its return values using comments. 52 | 53 | ```R 54 | # This function calculates the factorial of a non-negative integer n. 55 | # Parameters: 56 | # n: The input integer. 57 | # Returns: 58 | # The factorial of n. 
59 | factorial <- function(n) { 60 | if (n == 0) { 61 | return(1) # The factorial of 0 is defined as 1. 62 | } else { 63 | return(n * factorial(n - 1)) 64 | } 65 | } 66 | ``` 67 | 68 | Comments play an essential role in making your R code more readable and understandable to both yourself and others who may read your code. They help document your code, explain your thought process, and provide context for your programming decisions. 69 | -------------------------------------------------------------------------------- /03 Day R Variables and Constants/README.md: -------------------------------------------------------------------------------- 1 | # R Variables and Constants 2 | 3 | Welcome to www.codeswithpankaj.com! In this tutorial, we will explore the concepts of variables and constants in R, a powerful programming language used for statistical computing and graphics. Understanding these fundamental concepts is crucial for efficient programming in R. 4 | 5 | ## What are Variables in R? 6 | 7 | Variables are used to store data values in R. They act as containers that hold information which can be manipulated and referenced throughout the program. Variables in R are dynamically typed, meaning the same variable can hold different types of data at different times. 8 | 9 | ### Types of Variables 10 | 11 | 1. **Numeric Variables** 12 | 2. **Character Variables** 13 | 3. **Logical Variables** 14 | 4. **Complex Variables** 15 | 5. **Vector Variables** 16 | 6. **List Variables** 17 | 18 | Let's delve into each type with examples. 19 | 20 | ### 1. Numeric Variables 21 | 22 | Numeric variables store numbers. These can be whole numbers or floating-point numbers; R stores both as doubles unless you add the `L` suffix (for example, `10L`). 23 | 24 | **Example:** 25 | ```R 26 | # Whole number (stored as a double) 27 | x <- 10 28 | print(x) 29 | 30 | # Floating-point 31 | y <- 10.5 32 | print(y) 33 | ``` 34 | 35 | ### 2. Character Variables 36 | 37 | Character variables store text (strings). 
38 | 39 | **Example:** 40 | ```R 41 | # A short string 42 | name <- "Pankaj" 43 | print(name) 44 | 45 | # A longer string 46 | greeting <- "Hello, welcome to www.codeswithpankaj.com" 47 | print(greeting) 48 | ``` 49 | 50 | ### 3. Logical Variables 51 | 52 | Logical variables store Boolean values: `TRUE` or `FALSE`. 53 | 54 | **Example:** 55 | ```R 56 | # Logical values 57 | is_true <- TRUE 58 | print(is_true) 59 | 60 | is_false <- FALSE 61 | print(is_false) 62 | ``` 63 | 64 | ### 4. Complex Variables 65 | 66 | Complex variables store complex numbers, which have real and imaginary parts. 67 | 68 | **Example:** 69 | ```R 70 | # Complex number 71 | z <- 2 + 3i 72 | print(z) 73 | ``` 74 | 75 | ### 5. Vector Variables 76 | 77 | Vectors are one-dimensional sequences of data elements of the same basic type. 78 | 79 | **Example:** 80 | ```R 81 | # Numeric vector 82 | numbers <- c(1, 2, 3, 4, 5) 83 | print(numbers) 84 | 85 | # Character vector 86 | words <- c("apple", "banana", "cherry") 87 | print(words) 88 | ``` 89 | 90 | ### 6. List Variables 91 | 92 | Lists can store different types of elements (numbers, strings, vectors) and are used for more complex data structures. 93 | 94 | **Example:** 95 | ```R 96 | # List with different types of elements 97 | my_list <- list(name = "Pankaj", age = 28, scores = c(85, 90, 95)) 98 | print(my_list) 99 | ``` 100 | 101 | ## Constants in R 102 | 103 | Constants are fixed values that do not change during the execution of a program. Some languages provide a `const` keyword for this; R does not. By convention, R programmers assign the value to a variable (often with an upper-case name) and avoid modifying it. R also ships a few built-in constants, such as `pi`, `LETTERS`, and `month.name`. 104 | 105 | ### Types of Constants 106 | 107 | 1. **Numeric Constants** 108 | 2. **Character Constants** 109 | 3. **Logical Constants** 110 | 111 | Let's see examples for each. 112 | 113 | ### 1. Numeric Constants 114 | 115 | Numeric constants can be integers or floating-point numbers. 
116 | 117 | **Example:** 118 | ```R 119 | # Numeric constant 120 | PI <- 3.14159 121 | print(PI) 122 | ``` 123 | 124 | ### 2. Character Constants 125 | 126 | Character constants are string literals. 127 | 128 | **Example:** 129 | ```R 130 | # Character constant 131 | URL <- "www.codeswithpankaj.com" 132 | print(URL) 133 | ``` 134 | 135 | ### 3. Logical Constants 136 | 137 | Logical constants are `TRUE` and `FALSE`. 138 | 139 | **Example:** 140 | ```R 141 | # Logical constant 142 | IS_ACTIVE <- TRUE 143 | print(IS_ACTIVE) 144 | ``` 145 | 146 | ## Conclusion 147 | 148 | In this tutorial, we explored variables and constants in R, understanding their types and how to use them effectively. Mastering these concepts is essential for any R programmer, as they form the backbone of data manipulation and storage in your programs. 149 | 150 | Keep practicing and experimenting with different types of variables and constants to deepen your understanding. Happy coding! 151 | 152 | --- 153 | 154 | For more tutorials and guides, visit [www.codeswithpankaj.com](http://www.codeswithpankaj.com). 
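
Because R has no `const` keyword, a variable used as a constant can still be overwritten by accident. As a minimal sketch (the names `MAX_SCORE` and `result` are illustrative and do not come from this tutorial's files), base R's `lockBinding()` can make such a variable read-only, so that a later assignment fails with an error instead of silently changing the value:

```R
# Define a value we intend to treat as a constant (illustrative name).
MAX_SCORE <- 100

# Lock the binding in the current environment so it cannot be reassigned.
lockBinding("MAX_SCORE", environment())

# Attempting to change it now raises an error, which we catch here.
result <- tryCatch(
  {
    MAX_SCORE <- 200
    "modified"
  },
  error = function(e) "locked"
)

print(result)
print(MAX_SCORE)
```

If the value genuinely needs to change later, `unlockBinding("MAX_SCORE", environment())` reverses the lock.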
155 | 156 | --- 157 | -------------------------------------------------------------------------------- /03 Day R Variables and Constants/Variable.r: -------------------------------------------------------------------------------- 1 | # Numeric Variable 2 | age <- 25 3 | 4 | # Character Variable 5 | name <- "John" 6 | 7 | # Logical Variable 8 | is_student <- TRUE 9 | 10 | # Vector Variable 11 | scores <- c(85, 90, 78, 92, 88) 12 | 13 | # Data Frame Variable 14 | data <- data.frame(Name = c("Alice", "Bob", "Charlie"), 15 | Age = c(28, 35, 22)) 16 | 17 | # Matrix Variable 18 | mat <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2, ncol = 3) 19 | 20 | # List Variable 21 | person <- list(Name = "Alice", Age = 28, isStudent = TRUE) 22 | 23 | # Factor Variable (Categorical Data) 24 | gender <- factor(c("Male", "Female", "Male", "Female")) 25 | 26 | # Date Variable 27 | birthdate <- as.Date("1995-03-15") 28 | 29 | # Complex Number Variable 30 | complex_num <- 3 + 2i 31 | 32 | # Printing the variables 33 | cat("Numeric Variable (age):", age, "\n") 34 | cat("Character Variable (name):", name, "\n") 35 | cat("Logical Variable (is_student):", is_student, "\n") 36 | cat("Vector Variable (scores):", scores, "\n") 37 | cat("Data Frame Variable (data):\n") 38 | print(data) 39 | cat("Matrix Variable (mat):\n") 40 | print(mat) 41 | cat("List Variable (person):\n") 42 | print(person) 43 | cat("Factor Variable (gender):", as.character(gender), "\n") # cat() alone would print the integer codes 44 | cat("Date Variable (birthdate):", format(birthdate), "\n") # cat() alone would print the day count since 1970-01-01 45 | cat("Complex Number Variable (complex_num):", complex_num, "\n") 46 | -------------------------------------------------------------------------------- /04 Day R Data Types/CharacterDataType.r: -------------------------------------------------------------------------------- 1 | # Character Variables 2 | name <- "p4n" 3 | greeting <- 'codeswithpankaj' 4 | 5 | # Concatenation 6 | full_greeting <- paste(greeting, "My name is", name) 7 | 8 | # Subsetting characters 9 | first_letter <- substr(name, 1, 1) 10 | 
last_name <- substr(name, 3, 5) 11 | 12 | # String manipulation 13 | uppercase_name <- toupper(name) 14 | lowercase_greeting <- tolower(greeting) 15 | 16 | # Print the results 17 | cat("Character Variable (name):", name, "\n") 18 | cat("Character Variable (greeting):", greeting, "\n") 19 | cat("Full Greeting:", full_greeting, "\n") 20 | cat("First Letter:", first_letter, "\n") 21 | cat("Last Name:", last_name, "\n") 22 | cat("Uppercase Name:", uppercase_name, "\n") 23 | cat("Lowercase Greeting:", lowercase_greeting, "\n") 24 | -------------------------------------------------------------------------------- /04 Day R Data Types/ComplexDataTypes.r: -------------------------------------------------------------------------------- 1 | # Complex Variable 2 | complex_num <- 3 + 2i 3 | 4 | # Accessing Real and Imaginary Parts 5 | real_part <- Re(complex_num) 6 | imaginary_part <- Im(complex_num) 7 | 8 | # Complex Arithmetic Operations 9 | another_complex_num <- 1 - 4i 10 | 11 | # Addition 12 | sum_complex <- complex_num + another_complex_num 13 | 14 | # Subtraction 15 | diff_complex <- complex_num - another_complex_num 16 | 17 | # Multiplication 18 | prod_complex <- complex_num * another_complex_num 19 | 20 | # Division 21 | div_complex <- complex_num / another_complex_num 22 | 23 | # Print the results 24 | cat("Complex Variable (complex_num):", complex_num, "\n") 25 | cat("Real Part:", real_part, "\n") 26 | cat("Imaginary Part:", imaginary_part, "\n") 27 | cat("Complex Addition:", sum_complex, "\n") 28 | cat("Complex Subtraction:", diff_complex, "\n") 29 | cat("Complex Multiplication:", prod_complex, "\n") 30 | cat("Complex Division:", div_complex, "\n") 31 | -------------------------------------------------------------------------------- /04 Day R Data Types/DateTimeDataTypes.r: -------------------------------------------------------------------------------- 1 | # Date Variable 2 | birthdate <- as.Date("1995-03-15") 3 | 4 | # Current Date 5 | current_date <- Sys.Date() 6 
| 7 | # Time Variable (POSIXct) 8 | current_time <- Sys.time() 9 | 10 | # Formatting Dates 11 | formatted_date <- format(birthdate, "%Y-%m-%d") 12 | formatted_time <- format(current_time, "%Y-%m-%d %H:%M:%S") 13 | 14 | # Date Arithmetic (approximate: ignores leap years) 15 | age_years <- as.numeric(difftime(current_date, birthdate, units = "days") / 365) 16 | 17 | # Print the results (format() is needed because cat() prints dates as raw numbers) 18 | cat("Date Variable (birthdate):", format(birthdate), "\n") 19 | cat("Current Date:", format(current_date), "\n") 20 | cat("Current Time (POSIXct):", format(current_time), "\n") 21 | cat("Formatted Date:", formatted_date, "\n") 22 | cat("Formatted Time:", formatted_time, "\n") 23 | cat("Age in Years:", age_years, "\n") 24 | -------------------------------------------------------------------------------- /04 Day R Data Types/FactorDataTypes.r: -------------------------------------------------------------------------------- 1 | # Factor Variable 2 | gender <- factor(c("Male", "Female", "Male", "Female")) 3 | 4 | # Levels of the Factor 5 | levels(gender) 6 | 7 | # Summary of the Factor 8 | summary(gender) 9 | 10 | # Frequency of Each Level 11 | table(gender) 12 | 13 | # Accessing Elements 14 | first_gender <- gender[1] 15 | second_gender <- gender[2] 16 | 17 | # Setting the Level Order Explicitly (alphabetical order is the default) 18 | gender <- factor(gender, levels = c("Female", "Male")) 19 | 20 | # Summary After Reordering 21 | summary(gender) 22 | -------------------------------------------------------------------------------- /04 Day R Data Types/IntegerDataType.r: -------------------------------------------------------------------------------- 1 | # Integer Variable 2 | count <- as.integer(42) 3 | 4 | # Arithmetic Operations 5 | num1 <- 20L # 'L' suffix indicates an integer literal 6 | num2 <- 10L 7 | 8 | # Addition 9 | sum_result <- num1 + num2 10 | 11 | # Subtraction 12 | diff_result <- num1 - num2 13 | 14 | # Multiplication 15 | prod_result <- num1 * num2 16 | 17 | # Division (the result is a double, even for integer operands) 18 | div_result <- num1 / num2 19 | 20 | # Exponentiation (the result is also a double) 21 | exp_result <- num1^2 22 | 23 | # Modulus 
(remainder) 24 | mod_result <- num1 %% num2 25 | 26 | # Print the results 27 | cat("Integer Variable (count):", count, "\n") 28 | cat("Addition:", sum_result, "\n") 29 | cat("Subtraction:", diff_result, "\n") 30 | cat("Multiplication:", prod_result, "\n") 31 | cat("Division:", div_result, "\n") 32 | cat("Exponentiation:", exp_result, "\n") 33 | cat("Modulus (remainder):", mod_result, "\n") 34 | -------------------------------------------------------------------------------- /04 Day R Data Types/LogicalDataTypes.r: -------------------------------------------------------------------------------- 1 | # Logical Variables 2 | is_student <- TRUE 3 | is_adult <- FALSE 4 | 5 | # Logical Operations 6 | logical_and <- is_student & is_adult 7 | logical_or <- is_student | is_adult 8 | logical_not_student <- !is_student 9 | 10 | # Print the results 11 | cat("Is Student:", is_student, "\n") 12 | cat("Is Adult:", is_adult, "\n") 13 | cat("Logical AND (Student AND Adult):", logical_and, "\n") 14 | cat("Logical OR (Student OR Adult):", logical_or, "\n") 15 | cat("Logical NOT (NOT Student):", logical_not_student, "\n") 16 | -------------------------------------------------------------------------------- /04 Day R Data Types/NumericDataTypes.r: -------------------------------------------------------------------------------- 1 | # Whole number (stored as a double; write 42L for an integer) 2 | x <- 42 3 | 4 | # Double (floating-point) 5 | y <- 3.1415 6 | 7 | # Arithmetic operations 8 | a <- 10 9 | b <- 5 10 | 11 | # Addition 12 | sum_result <- a + b 13 | 14 | # Subtraction 15 | diff_result <- a - b 16 | 17 | # Multiplication 18 | prod_result <- a * b 19 | 20 | # Division 21 | div_result <- a / b 22 | 23 | # Exponentiation 24 | exp_result <- a^b 25 | 26 | # Modulus (remainder) 27 | mod_result <- a %% b 28 | 29 | # Print the results 30 | cat("Whole number (x):", x, "\n") 31 | cat("Double (y):", y, "\n") 32 | cat("Sum:", sum_result, "\n") 33 | cat("Difference:", diff_result, "\n") 34 | cat("Product:", prod_result, "\n") 35 | cat("Division:", 
div_result, "\n") 36 | cat("Exponentiation:", exp_result, "\n") 37 | cat("Modulus (remainder):", mod_result, "\n") 38 | -------------------------------------------------------------------------------- /04 Day R Data Types/README.md: -------------------------------------------------------------------------------- 1 | # R Data Types Tutorial 2 | 3 | Welcome to www.codeswithpankaj.com! In this tutorial, we will explore the different data types in R, an essential aspect of programming in this language. Understanding R's data types will enable you to manage and manipulate data efficiently. 4 | 5 | ## What are Data Types in R? 6 | 7 | Data types in R define the kind of data that can be stored and manipulated within a variable. They determine the operations that can be performed on the data and how it is stored in memory. 8 | 9 | ### Primary Data Types in R 10 | 11 | 1. **Numeric** 12 | 2. **Integer** 13 | 3. **Complex** 14 | 4. **Character** 15 | 5. **Logical** 16 | 6. **Raw** 17 | 18 | Let's examine each data type with examples. 19 | 20 | ### 1. Numeric 21 | 22 | Numeric data type is used for numbers, which can be either integers or floating-point numbers. 23 | 24 | **Example:** 25 | ```R 26 | # Numeric (double) 27 | num1 <- 42.5 28 | print(num1) 29 | 30 | # Integer 31 | num2 <- 42L 32 | print(num2) 33 | ``` 34 | 35 | ### 2. Integer 36 | 37 | Integer data type stores whole numbers. To explicitly declare an integer, append `L` to the number. 38 | 39 | **Example:** 40 | ```R 41 | # Integer 42 | int_num <- 10L 43 | print(int_num) 44 | ``` 45 | 46 | ### 3. Complex 47 | 48 | Complex data type is used for complex numbers, which have real and imaginary parts. 49 | 50 | **Example:** 51 | ```R 52 | # Complex number 53 | comp_num <- 3 + 4i 54 | print(comp_num) 55 | ``` 56 | 57 | ### 4. Character 58 | 59 | Character data type is used to store text (strings). 
60 | 61 | **Example:** 62 | ```R 63 | # Single character 64 | char1 <- "R" 65 | print(char1) 66 | 67 | # String 68 | char2 <- "Welcome to www.codeswithpankaj.com" 69 | print(char2) 70 | ``` 71 | 72 | ### 5. Logical 73 | 74 | Logical data type is used for Boolean values: `TRUE` or `FALSE`. 75 | 76 | **Example:** 77 | ```R 78 | # Logical value 79 | is_true <- TRUE 80 | print(is_true) 81 | 82 | is_false <- FALSE 83 | print(is_false) 84 | ``` 85 | 86 | ### 6. Raw 87 | 88 | Raw data type is used to store raw bytes. 89 | 90 | **Example:** 91 | ```R 92 | # Raw data 93 | raw_data <- charToRaw("R") 94 | print(raw_data) 95 | ``` 96 | 97 | ## Compound Data Types in R 98 | 99 | R also supports compound data types, which can store multiple values of the same or different types. 100 | 101 | ### 1. Vectors 102 | 103 | Vectors are sequences of data elements of the same type. They can be numeric, character, or logical. 104 | 105 | **Example:** 106 | ```R 107 | # Numeric vector 108 | num_vector <- c(1, 2, 3, 4, 5) 109 | print(num_vector) 110 | 111 | # Character vector 112 | char_vector <- c("apple", "banana", "cherry") 113 | print(char_vector) 114 | 115 | # Logical vector 116 | log_vector <- c(TRUE, FALSE, TRUE) 117 | print(log_vector) 118 | ``` 119 | 120 | ### 2. Lists 121 | 122 | Lists can store different types of elements, including numbers, strings, vectors, and other lists. 123 | 124 | **Example:** 125 | ```R 126 | # List with different types of elements 127 | my_list <- list(name = "Pankaj", age = 28, scores = c(85, 90, 95)) 128 | print(my_list) 129 | ``` 130 | 131 | ### 3. Matrices 132 | 133 | Matrices are two-dimensional arrays where each element has the same data type. 134 | 135 | **Example:** 136 | ```R 137 | # Numeric matrix 138 | matrix_data <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2, ncol = 3) 139 | print(matrix_data) 140 | ``` 141 | 142 | ### 4. Data Frames 143 | 144 | Data frames are tables or 2D arrays where each column can contain different types of data. 
145 | 146 | **Example:** 147 | ```R 148 | # Data frame 149 | data <- data.frame( 150 | name = c("John", "Jane", "Doe"), 151 | age = c(28, 34, 23), 152 | score = c(85, 90, 88) 153 | ) 154 | print(data) 155 | ``` 156 | 157 | ### 5. Factors 158 | 159 | Factors are used to represent categorical data and store it as levels. 160 | 161 | **Example:** 162 | ```R 163 | # Factor 164 | gender <- factor(c("male", "female", "male", "female")) 165 | print(gender) 166 | ``` 167 | 168 | ### 6. Arrays 169 | 170 | Arrays can store data in more than two dimensions. 171 | 172 | **Example:** 173 | ```R 174 | # 3D Array 175 | array_data <- array(1:24, dim = c(3, 4, 2)) 176 | print(array_data) 177 | ``` 178 | 179 | ## Conclusion 180 | 181 | In this tutorial, we explored the various data types available in R, including primary and compound data types. Understanding these data types is crucial for efficient data management and manipulation in R. Keep practicing to become proficient in handling different data types in your R programs. 182 | 183 | For more tutorials and guides, visit [www.codeswithpankaj.com](http://www.codeswithpankaj.com). 184 | 185 | --- 186 | 187 | Happy coding! 
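
To check which of the data types above a value actually has, base R provides `class()` and `typeof()`. The sketch below (all variable names are illustrative) shows that a number written without the `L` suffix is stored as a double, and how the compound types report themselves:

```R
# Collect one value of each type discussed in this tutorial.
values <- list(
  double_value  = 42,
  integer_value = 42L,
  complex_value = 3 + 4i,
  text_value    = "R",
  logical_value = TRUE,
  raw_value     = charToRaw("R"),
  factor_value  = factor(c("male", "female")),
  matrix_value  = matrix(1:6, nrow = 2),
  frame_value   = data.frame(age = c(28, 34))
)

# class() reports the high-level type; typeof() reports the internal storage.
type_table <- data.frame(
  class  = sapply(values, function(v) class(v)[1]),
  typeof = sapply(values, typeof),
  stringsAsFactors = FALSE
)
print(type_table)
```

The two columns differ for factors (stored as integers) and data frames (stored as lists), which is why both functions are worth knowing.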
188 | -------------------------------------------------------------------------------- /05 Day R Print Output & User Input/PrintOutputExample.r: -------------------------------------------------------------------------------- 1 | # Numeric Variable 2 | x <- 42 3 | 4 | # Character Variable 5 | name <- "Alice" 6 | 7 | # Logical Variable 8 | is_student <- TRUE 9 | 10 | # Vector Variable 11 | numbers <- c(1, 2, 3, 4, 5) 12 | 13 | # Printing using 'print()' function 14 | print("Printing using 'print()' function:") 15 | print(x) 16 | print(name) 17 | print(is_student) 18 | print(numbers) 19 | 20 | # Printing using 'cat()' function 21 | cat("\nPrinting using 'cat()' function:\n") 22 | cat("x:", x, "\n") 23 | cat("Name:", name, "\n") 24 | cat("Is Student:", is_student, "\n") 25 | cat("Numbers:", numbers, "\n") 26 | 27 | # Printing using 'paste()' function 28 | cat("\nPrinting using 'paste()' function:\n") 29 | paste_output <- paste("Name:", name, ", Age:", x) 30 | cat(paste_output, "\n") 31 | 32 | # Printing using 'sprintf()' function 33 | cat("\nPrinting using 'sprintf()' function:\n") 34 | sprintf_output <- sprintf("Name: %s, Age: %d", name, x) 35 | cat(sprintf_output, "\n") 36 | -------------------------------------------------------------------------------- /05 Day R Print Output & User Input/README.md: -------------------------------------------------------------------------------- 1 | # R Print Output and Input Tutorial 2 | 3 | Welcome to www.codeswithpankaj.com! In this tutorial, we will explore how to handle output and input in R, a fundamental aspect of interacting with users and displaying information in your programs. 4 | 5 | ## Printing Output in R 6 | 7 | Printing output is essential for debugging and communicating results to users. R provides several functions to print output to the console. 8 | 9 | ### 1. `print()` Function 10 | 11 | The `print()` function is the most basic way to display output in R. 
12 | 13 | **Example:** 14 | ```R 15 | # Using print() function 16 | x <- 42 17 | print(x) 18 | 19 | message <- "Welcome to www.codeswithpankaj.com" 20 | print(message) 21 | ``` 22 | 23 | ### 2. `cat()` Function 24 | 25 | The `cat()` function concatenates and prints objects. It is useful for combining multiple items into a single output. 26 | 27 | **Example:** 28 | ```R 29 | # Using cat() function 30 | name <- "Pankaj" 31 | age <- 28 32 | cat("Name:", name, "\nAge:", age) 33 | ``` 34 | 35 | ### 3. `paste()` Function 36 | 37 | The `paste()` function concatenates strings and returns a single string. The result can be printed using `print()` or `cat()`. 38 | 39 | **Example:** 40 | ```R 41 | # Using paste() function 42 | greeting <- paste("Hello,", "welcome to www.codeswithpankaj.com!") 43 | print(greeting) 44 | 45 | # Using cat() to print paste() result 46 | cat(paste("Hello,", name, "you are", age, "years old.\n")) 47 | ``` 48 | 49 | ### 4. `sprintf()` Function 50 | 51 | The `sprintf()` function is used for formatted output. It is similar to the `printf` function in C. 52 | 53 | **Example:** 54 | ```R 55 | # Using sprintf() function 56 | formatted_output <- sprintf("Name: %s, Age: %d", name, age) 57 | print(formatted_output) 58 | 59 | # Using cat() to print sprintf() result 60 | cat(sprintf("Name: %s, Age: %d\n", name, age)) 61 | ``` 62 | 63 | ## Taking Input in R 64 | 65 | Taking input from the user allows for interactive programs. The `readline()` function is commonly used to take input from the console. 66 | 67 | ### 1. `readline()` Function 68 | 69 | The `readline()` function reads a line of input from the user as a string. 
70 | 71 | **Example:** 72 | ```R 73 | # Using readline() function to take input 74 | name <- readline(prompt = "Enter your name: ") 75 | print(paste("Hello,", name)) 76 | 77 | age <- readline(prompt = "Enter your age: ") 78 | age <- as.numeric(age) # Convert input to numeric 79 | print(paste("You are", age, "years old.")) 80 | ``` 81 | 82 | ### 2. `scan()` Function 83 | 84 | The `scan()` function reads values from the console (or a file) and can read several values at once. 85 | 86 | **Example:** 87 | ```R 88 | # Using scan() function to take multiple inputs 89 | numbers <- scan(what = numeric(), nmax = 3, quiet = TRUE) 90 | print(numbers) 91 | 92 | # Using scan() function to read a single value 93 | single_value <- scan(what = numeric(), nmax = 1, quiet = TRUE) 94 | print(single_value) 95 | ``` 96 | 97 | ## Combining Input and Output 98 | 99 | Combining input and output allows for creating interactive scripts. 100 | 101 | **Example:** 102 | ```R 103 | # Interactive script example 104 | name <- readline(prompt = "Enter your name: ") 105 | age <- readline(prompt = "Enter your age: ") 106 | age <- as.integer(age) # %d in sprintf() requires a whole number 107 | 108 | # Output 109 | cat(sprintf("Hello, %s! You are %d years old.\n", name, age)) 110 | ``` 111 | 112 | ## Conclusion 113 | 114 | In this tutorial, we explored how to handle output and input in R. We covered various functions to print output to the console, including `print()`, `cat()`, `paste()`, and `sprintf()`. We also learned how to take user input using `readline()` and `scan()` functions. Mastering these functions is essential for creating interactive and user-friendly R programs. 115 | 116 | For more tutorials and guides, visit [www.codeswithpankaj.com](http://www.codeswithpankaj.com). 
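
`readline()` always returns a string, and `as.numeric()` produces `NA` (with a warning) when that string is not a number; `sprintf()` with `%d` also rejects values that are not whole numbers. A small validation helper can keep an interactive script from failing on unexpected input. The function names `parse_age` and `greet` below are hypothetical, and the example passes strings directly instead of calling `readline()` so that it also runs non-interactively:

```R
# Convert a raw input string to a whole-number age, or NA if it is not valid.
parse_age <- function(input) {
  value <- suppressWarnings(as.numeric(input))
  if (is.na(value) || value < 0 || value != floor(value)) {
    return(NA_integer_)
  }
  as.integer(value)
}

# Build the greeting only when the age could be parsed.
greet <- function(name, raw_age) {
  age <- parse_age(raw_age)
  if (is.na(age)) {
    return(sprintf("Hello, %s! Your age could not be read.", name))
  }
  sprintf("Hello, %s! You are %d years old.", name, age)
}

cat(greet("Pankaj", "28"), "\n")
cat(greet("Pankaj", "twenty"), "\n")
```

In an interactive session, the second argument would come from `readline(prompt = "Enter your age: ")`.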
117 | 118 | --- 119 | -------------------------------------------------------------------------------- /06 Day R Operators/ArithmeticOperators.r: -------------------------------------------------------------------------------- 1 | # Arithmetic Operators Example 2 | 3 | # Numeric Variables 4 | x <- 10 5 | y <- 3 6 | 7 | # Addition 8 | sum_result <- x + y 9 | 10 | # Subtraction 11 | difference_result <- x - y 12 | 13 | # Multiplication 14 | product_result <- x * y 15 | 16 | # Division 17 | quotient_result <- x / y 18 | 19 | # Exponentiation 20 | exponent_result <- x^y 21 | 22 | # Modulus (Remainder) 23 | remainder_result <- x %% y 24 | 25 | # Integer Division (Quotient) 26 | integer_division_result <- x %/% y 27 | 28 | # Print the results 29 | cat("Addition (x + y):", sum_result, "\n") 30 | cat("Subtraction (x - y):", difference_result, "\n") 31 | cat("Multiplication (x * y):", product_result, "\n") 32 | cat("Division (x / y):", quotient_result, "\n") 33 | cat("Exponentiation (x ^ y):", exponent_result, "\n") 34 | cat("Modulus (x %% y):", remainder_result, "\n") 35 | cat("Integer Division (x %/% y):", integer_division_result, "\n") 36 | -------------------------------------------------------------------------------- /06 Day R Operators/AssignmentOperators.r: -------------------------------------------------------------------------------- 1 | # Assignment Operators Example 2 | # R has no compound assignment operators (+=, -=, *=, /=); 3 | # update a variable by reassigning the result of the operation. 4 | 5 | # Numeric Variables 6 | x <- 10 7 | y <- 3 8 | 9 | # Assignment Operator (<-) 10 | sum_result <- x + y 11 | cat("sum_result <- x + y:", sum_result, "\n") 12 | 13 | # Add and reassign (equivalent of +=) 14 | x <- x + y 15 | cat("x <- x + y:", x, "\n") 16 | 17 | # Subtract and reassign (equivalent of -=) 18 | y <- y - 2 19 | cat("y <- y - 2:", y, "\n") 20 | 21 | # Multiply and reassign (equivalent of *=) 22 | x <- x * 2 23 | cat("x <- x * 2:", x, "\n") 24 | 25 | # Divide and reassign (equivalent of /=) 26 | y <- y / 2 27 | cat("y <- y / 2:", y, "\n") 
28 | -------------------------------------------------------------------------------- /06 Day R Operators/ComparisonOperators.r: -------------------------------------------------------------------------------- 1 | # Comparison Operators Example 2 | 3 | # Numeric Variables 4 | x <- 10 5 | y <- 3 6 | 7 | # Equal to 8 | is_equal <- x == y 9 | 10 | # Not equal to 11 | is_not_equal <- x != y 12 | 13 | # Less than 14 | is_less_than <- x < y 15 | 16 | # Greater than 17 | is_greater_than <- x > y 18 | 19 | # Less than or equal to 20 | is_less_than_or_equal <- x <= y 21 | 22 | # Greater than or equal to 23 | is_greater_than_or_equal <- x >= y 24 | 25 | # Print the results 26 | cat("Equal to (x == y):", is_equal, "\n") 27 | cat("Not equal to (x != y):", is_not_equal, "\n") 28 | cat("Less than (x < y):", is_less_than, "\n") 29 | cat("Greater than (x > y):", is_greater_than, "\n") 30 | cat("Less than or equal to (x <= y):", is_less_than_or_equal, "\n") 31 | cat("Greater than or equal to (x >= y):", is_greater_than_or_equal, "\n") 32 | -------------------------------------------------------------------------------- /06 Day R Operators/LogicalOperators.r: -------------------------------------------------------------------------------- 1 | # Logical Operators Example 2 | 3 | # Logical Variables 4 | is_true <- TRUE 5 | is_false <- FALSE 6 | 7 | # Logical NOT 8 | logical_not <- !is_true 9 | 10 | # Logical AND (element-wise) 11 | logical_and <- is_true & is_false 12 | 13 | # Logical OR (element-wise) 14 | logical_or <- is_true | is_false 15 | 16 | # Short-circuit AND (single values only) 17 | short_circuit_and <- is_true && is_false 18 | 19 | # Short-circuit OR (single values only) 20 | short_circuit_or <- is_true || is_false 21 | 22 | # Print the results 23 | cat("Logical NOT (!is_true):", logical_not, "\n") 24 | cat("Logical AND (is_true & is_false):", logical_and, "\n") 25 | cat("Logical OR 
(is_true | is_false):", logical_or, "\n") 26 | cat("Short-circuit AND (is_true && is_false):", short_circuit_and, "\n") 27 | cat("Short-circuit OR (is_true || is_false):", short_circuit_or, "\n") 28 | -------------------------------------------------------------------------------- /06 Day R Operators/MiscellaneousOperators.r: -------------------------------------------------------------------------------- 1 | # Miscellaneous Operators Example 2 | 3 | # Numeric Vector 4 | numbers <- c(1, 2, 3, 4, 5) 5 | 6 | # %in% Operator 7 | is_present <- 5 %in% numbers 8 | 9 | # : Operator 10 | sequence <- 1:5 11 | 12 | # %*% Operator (Matrix Multiplication) 13 | matrix1 <- matrix(c(1, 2, 3, 4), nrow = 2) 14 | matrix2 <- matrix(c(5, 6, 7, 8), nrow = 2) 15 | matrix_product <- matrix1 %*% matrix2 16 | 17 | # %% Operator (Modulus) 18 | remainder <- 10 %% 3 19 | 20 | # %/% Operator (Integer Division) 21 | integer_quotient <- 10 %/% 3 22 | 23 | # Print the results 24 | cat("Using %in% Operator:\n") 25 | cat("Is 5 present in numbers?", is_present, "\n") 26 | 27 | cat("\nUsing : Operator:\n") 28 | cat("Sequence 1:5:", sequence, "\n") 29 | 30 | cat("\nUsing %*% Operator (Matrix Multiplication):\n") 31 | cat("Matrix 1:\n", matrix1, "\n") 32 | cat("Matrix 2:\n", matrix2, "\n") 33 | cat("Matrix Product:\n", matrix_product, "\n") 34 | 35 | cat("\nUsing %% Operator (Modulus):\n") 36 | cat("Remainder of 10 divided by 3:", remainder, "\n") 37 | 38 | cat("\nUsing %/% Operator (Integer Division):\n") 39 | cat("Integer quotient of 10 divided by 3:", integer_quotient, "\n") 40 | -------------------------------------------------------------------------------- /06 Day R Operators/README.md: -------------------------------------------------------------------------------- 1 | # **Tutorial on R Operators** 2 | 3 | **Website Name**: [www.codeswithpankaj.com](http://www.codeswithpankaj.com) 4 | **Tutorial Name**: Codes With Pankaj 5 | 6 | --- 7 | 8 | ## **Introduction to R Operators** 9 | 10 | Operators 
in R are symbols or keywords that tell the R interpreter to perform specific mathematical, logical, or relational operations on operands. Operators form the basis of any programming language, and R is no exception. 11 | 12 | In this tutorial, we will cover the following types of operators in R: 13 | 1. Arithmetic Operators 14 | 2. Relational Operators 15 | 3. Logical Operators 16 | 4. Assignment Operators 17 | 5. Miscellaneous Operators 18 | 19 | Each section will include detailed explanations and examples that are simple and easy to understand for university students. 20 | 21 | --- 22 | 23 | ### **1. Arithmetic Operators** 24 | 25 | Arithmetic operators are used to perform basic mathematical operations such as addition, subtraction, multiplication, and division. 26 | 27 | | Operator | Description | Example | Result | 28 | |----------|---------------------------|-----------------------|--------| 29 | | `+` | Addition | `3 + 4` | `7` | 30 | | `-` | Subtraction | `8 - 2` | `6` | 31 | | `*` | Multiplication | `6 * 7` | `42` | 32 | | `/` | Division | `10 / 2` | `5` | 33 | | `%%` | Modulus | `11 %% 3` | `2` | 34 | | `%/%` | Integer Division | `10 %/% 3` | `3` | 35 | | `^` | Exponentiation | `2 ^ 3` | `8` | 36 | 37 | #### **Example:** 38 | ```r 39 | # Performing arithmetic operations in R 40 | a <- 15 41 | b <- 4 42 | 43 | # Addition 44 | result_add <- a + b 45 | print(paste("Addition:", result_add)) # Output: Addition: 19 46 | 47 | # Subtraction 48 | result_sub <- a - b 49 | print(paste("Subtraction:", result_sub)) # Output: Subtraction: 11 50 | 51 | # Multiplication 52 | result_mul <- a * b 53 | print(paste("Multiplication:", result_mul)) # Output: Multiplication: 60 54 | 55 | # Division 56 | result_div <- a / b 57 | print(paste("Division:", result_div)) # Output: Division: 3.75 58 | 59 | # Modulus 60 | result_mod <- a %% b 61 | print(paste("Modulus:", result_mod)) # Output: Modulus: 3 62 | 63 | # Exponentiation 64 | result_exp <- a ^ b 65 | print(paste("Exponentiation:", 
result_exp)) # Output: Exponentiation: 50625 66 | ``` 67 | 68 | ### **2. Relational Operators** 69 | 70 | Relational operators are used to compare two values. They return a logical value (`TRUE` or `FALSE`). 71 | 72 | | Operator | Description | Example | Result | 73 | |----------|-----------------------|--------------|---------| 74 | | `==` | Equal to | `5 == 5` | `TRUE` | 75 | | `!=` | Not equal to | `5 != 3` | `TRUE` | 76 | | `>` | Greater than | `7 > 3` | `TRUE` | 77 | | `<` | Less than | `4 < 9` | `TRUE` | 78 | | `>=` | Greater than or equal | `5 >= 5` | `TRUE` | 79 | | `<=` | Less than or equal | `3 <= 2` | `FALSE` | 80 | 81 | #### **Example:** 82 | ```r 83 | # Using relational operators in R 84 | x <- 10 85 | y <- 15 86 | 87 | # Greater than 88 | result_gt <- x > y 89 | print(paste("Is x greater than y?", result_gt)) # Output: Is x greater than y? FALSE 90 | 91 | # Less than 92 | result_lt <- x < y 93 | print(paste("Is x less than y?", result_lt)) # Output: Is x less than y? TRUE 94 | 95 | # Equal to 96 | result_eq <- x == y 97 | print(paste("Is x equal to y?", result_eq)) # Output: Is x equal to y? FALSE 98 | ``` 99 | 100 | ### **3. Logical Operators** 101 | 102 | Logical operators are used to combine or invert logical statements. The result is always a logical value (`TRUE` or `FALSE`). 
103 | 104 | | Operator | Description | Example | Result | 105 | |----------|-------------|------------------|--------| 106 | | `&` | AND | `TRUE & FALSE` | `FALSE`| 107 | | `\|` | OR | `TRUE \| FALSE` | `TRUE` | 108 | | `!` | NOT | `!TRUE` | `FALSE`| 109 | 110 | #### **Example:** 111 | ```r 112 | # Using logical operators in R 113 | p <- TRUE 114 | q <- FALSE 115 | 116 | # Logical AND 117 | result_and <- p & q 118 | print(paste("p AND q:", result_and)) # Output: p AND q: FALSE 119 | 120 | # Logical OR 121 | result_or <- p | q 122 | print(paste("p OR q:", result_or)) # Output: p OR q: TRUE 123 | 124 | # Logical NOT 125 | result_not <- !p 126 | print(paste("NOT p:", result_not)) # Output: NOT p: FALSE 127 | ``` 128 | 129 | ### **4. Assignment Operators** 130 | 131 | Assignment operators are used to assign values to variables. 132 | 133 | | Operator | Description | Example | Equivalent | 134 | |----------|----------------------|--------------|------------| 135 | | `<-` | Assign left | `x <- 5` | `x = 5` | 136 | | `->` | Assign right | `5 -> x` | `x = 5` | 137 | | `<<-` | Superassignment (assigns in an enclosing environment, or the global one if the name is not found) | `x <<- 10` | - | 138 | | `=` | Assign left | `x = 20` | - | 139 | 140 | #### **Example:** 141 | ```r 142 | # Assigning values using assignment operators in R 143 | x <- 25 # Assign 25 to x 144 | y = 30 # Assign 30 to y 145 | 146 | # Print values 147 | print(paste("Value of x:", x)) # Output: Value of x: 25 148 | print(paste("Value of y:", y)) # Output: Value of y: 30 149 | ``` 150 | 151 | ### **5. Miscellaneous Operators** 152 | 153 | R also provides some miscellaneous operators that are used in specific scenarios. 
154 | 155 | | Operator | Description | Example | Result | 156 | |----------|--------------------------------------|--------------------------|--------| 157 | | `:` | Sequence operator | `1:5` | `1 2 3 4 5` | 158 | | `%in%` | Element in vector | `3 %in% c(1, 2, 3, 4)` | `TRUE` | 159 | | `%*%` | Matrix multiplication | `matrix(1:4, 2, 2) %*% matrix(1:4, 2, 2)` | - | 160 | 161 | #### **Example:** 162 | ```r 163 | # Using miscellaneous operators in R 164 | 165 | # Sequence operator 166 | sequence <- 1:5 167 | print(paste("Sequence from 1 to 5:", toString(sequence))) # Output: Sequence from 1 to 5: 1, 2, 3, 4, 5 168 | 169 | # Element in vector 170 | is_in_vector <- 3 %in% c(1, 2, 3, 4) 171 | print(paste("Is 3 in the vector?", is_in_vector)) # Output: Is 3 in the vector? TRUE 172 | 173 | # Matrix multiplication 174 | matrix1 <- matrix(1:4, nrow=2, ncol=2) 175 | matrix2 <- matrix(1:4, nrow=2, ncol=2) 176 | result_matrix <- matrix1 %*% matrix2 177 | print("Matrix multiplication result:") 178 | print(result_matrix) 179 | ``` 180 | 181 | --- 182 | 183 | ### **Conclusion** 184 | 185 | This tutorial covered the fundamental R operators with simple examples, ensuring that university students can grasp the concepts with ease. The `Codes With Pankaj` tutorial on [www.codeswithpankaj.com](http://www.codeswithpankaj.com) aims to provide clear and concise explanations that aid in understanding the basics of R programming. 
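The logical operators in section 3 work element by element, so they also apply to whole vectors. R additionally has `&&` and `||` (used in this repository's `LogicalOperators.r`), which are meant for single values and stop evaluating as soon as the answer is known. The sketch below contrasts the two forms. The function name `is_first_positive()` is made up for illustration.

```r
# Element-wise operators compare vectors position by position
x <- c(TRUE, FALSE, TRUE)
y <- c(TRUE, TRUE, FALSE)

elementwise_and <- x & y
elementwise_or <- x | y
print(elementwise_and)
print(elementwise_or)

# Short-circuit && works on single values; the right-hand side is skipped
# when the left-hand side is already FALSE.
# is_first_positive() is a hypothetical helper, not part of the tutorial.
is_first_positive <- function(v) {
  length(v) > 0 && v[1] > 0
}

print(is_first_positive(c(3, -1)))
print(is_first_positive(numeric(0)))
```

Short-circuiting is what makes the empty-vector call safe here: `v[1] > 0` is never evaluated once `length(v) > 0` is `FALSE`.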
186 | 187 | -------------------------------------------------------------------------------- /07 Day R if...else/Checking_Positive_Negative_Number.r: -------------------------------------------------------------------------------- 1 | # Example 3: Checking Positive or Negative Number 2 | number <- -7 3 | 4 | if (number > 0) { 5 | cat("The number is positive.\n") 6 | } else if (number < 0) { 7 | cat("The number is negative.\n") 8 | } else { 9 | cat("The number is zero.\n") 10 | } 11 | -------------------------------------------------------------------------------- /07 Day R if...else/Example1Simple.r: -------------------------------------------------------------------------------- 1 | # Example 1: Simple if...else 2 | x <- 10 3 | 4 | if (x > 5) { 5 | cat("x is greater than 5.\n") 6 | } else { 7 | cat("x is not greater than 5.\n") 8 | } 9 | -------------------------------------------------------------------------------- /07 Day R if...else/Example2Nested.r: -------------------------------------------------------------------------------- 1 | # Example 2: Nested if...else 2 | y <- 3 3 | 4 | if (y > 5) { 5 | cat("y is greater than 5.\n") 6 | } else if (y == 5) { 7 | cat("y is equal to 5.\n") 8 | } else { 9 | cat("y is less than 5.\n") 10 | } 11 | -------------------------------------------------------------------------------- /07 Day R if...else/GradeClassification.r: -------------------------------------------------------------------------------- 1 | # Example 4: Grade Classification 2 | score <- 85 3 | 4 | if (score >= 90) { 5 | cat("Grade: A\n") 6 | } else if (score >= 80) { 7 | cat("Grade: B\n") 8 | } else if (score >= 70) { 9 | cat("Grade: C\n") 10 | } else if (score >= 60) { 11 | cat("Grade: D\n") 12 | } else { 13 | cat("Grade: F\n") 14 | } 15 | -------------------------------------------------------------------------------- /08 Day R ifelse() Function/Example1.r: -------------------------------------------------------------------------------- 1 | # Example 1: 
Categorizing Exam Scores 2 | scores <- c(78, 92, 64, 88, 75) 3 | grades <- ifelse(scores >= 90, "A", ifelse(scores >= 80, "B", ifelse(scores >= 70, "C", "D"))) 4 | 5 | # Print the result 6 | cat("Grades:", grades, "\n") 7 | -------------------------------------------------------------------------------- /08 Day R ifelse() Function/Example2.r: -------------------------------------------------------------------------------- 1 | # Example 2: Checking for Missing Values 2 | data <- c(15, NA, 42, NA, 58) 3 | is_missing <- ifelse(is.na(data), "Missing", "Not Missing") 4 | 5 | # Print the result 6 | cat("Missing Status:", is_missing, "\n") 7 | -------------------------------------------------------------------------------- /08 Day R ifelse() Function/Example3.r: -------------------------------------------------------------------------------- 1 | # Example 3: Categorizing Ages 2 | ages <- c(25, 35, 42, 18, 60) 3 | age_category <- ifelse(ages < 18, "Child", ifelse(ages < 65, "Adult", "Senior")) 4 | 5 | # Print the result 6 | cat("Age Categories:", age_category, "\n") 7 | -------------------------------------------------------------------------------- /08 Day R ifelse() Function/Example4.r: -------------------------------------------------------------------------------- 1 | # Example 4: Calculating Discounts 2 | product_prices <- c(25, 50, 75, 100) 3 | discounted_prices <- ifelse(product_prices >= 50, product_prices * 0.9, product_prices) 4 | 5 | # Print the result 6 | cat("Discounted Prices:", discounted_prices, "\n") 7 | -------------------------------------------------------------------------------- /08 Day R ifelse() Function/README.md: -------------------------------------------------------------------------------- 1 | # R ifelse() Function Tutorial 2 | 3 | Welcome to www.codeswithpankaj.com! In this tutorial, we will explore the `ifelse()` function in R. 
The `ifelse()` function is used for vectorized conditional evaluation, making it efficient for applying conditions to entire vectors or arrays. 4 | 5 | ## What is the ifelse() Function? 6 | 7 | The `ifelse()` function evaluates a condition for each element in a vector or array and returns a value based on whether the condition is `TRUE` or `FALSE`. It provides a concise way to perform conditional operations on data structures. 8 | 9 | ### Syntax of ifelse() Function 10 | 11 | The basic syntax of the `ifelse()` function is: 12 | 13 | ```R 14 | ifelse(test, yes, no) 15 | ``` 16 | 17 | - `test`: A logical vector or scalar indicating whether the condition is true or false. 18 | - `yes`: Value to be returned if `test` is `TRUE`. 19 | - `no`: Value to be returned if `test` is `FALSE`. 20 | 21 | ### Example: Using ifelse() Function 22 | 23 | Let's see a simple example to understand how `ifelse()` works: 24 | 25 | ```R 26 | # Example of ifelse() function 27 | x <- c(1, 5, -3, 7, -2) 28 | 29 | result <- ifelse(x > 0, "Positive", "Negative or Zero") 30 | print(result) 31 | ``` 32 | 33 | In this example: 34 | - `x` is a numeric vector containing values `1, 5, -3, 7, -2`. 35 | - `ifelse(x > 0, "Positive", "Negative or Zero")` checks each element of `x`. 36 | - If an element of `x` is greater than `0`, `"Positive"` is returned; otherwise, `"Negative or Zero"` is returned. 37 | 38 | ### Nested ifelse() Statements 39 | 40 | You can nest `ifelse()` statements to handle more complex conditions: 41 | 42 | ```R 43 | # Nested ifelse() example 44 | marks <- c(85, 70, 60, 95, 80) 45 | 46 | result <- ifelse(marks >= 90, "A", 47 | ifelse(marks >= 80, "B", 48 | ifelse(marks >= 70, "C", "D"))) 49 | 50 | print(result) 51 | ``` 52 | 53 | In this example: 54 | - `marks` is a vector of numeric values. 55 | - The nested `ifelse()` statements check the value of `marks` against multiple conditions (`>= 90`, `>= 80`, `>= 70`) to assign grades `"A"`, `"B"`, `"C"`, or `"D"`. 
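When the nesting gets deep, base R's `cut()` is one possible alternative, since it assigns each value to an interval in a single call. This sketch assumes the same `marks` vector and grade boundaries as the nested example above and should produce the same grades.

```R
# Same marks as in the nested ifelse() example
marks <- c(85, 70, 60, 95, 80)

# right = FALSE makes each interval closed on the left: [70, 80), [80, 90), ...
grades <- cut(marks,
              breaks = c(-Inf, 70, 80, 90, Inf),
              labels = c("D", "C", "B", "A"),
              right = FALSE)

# cut() returns a factor; convert to character to match ifelse() output
result <- as.character(grades)
print(result)
```

Each boundary appears once in `breaks`, which makes it easier to add a grade band than to thread another `ifelse()` into the chain.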
56 | 57 | ### Using ifelse() with Functions 58 | 59 | You can integrate `ifelse()` within functions to perform conditional operations based on function parameters or data: 60 | 61 | ```R 62 | # Using ifelse() in a function 63 | grade <- function(score) { 64 | return(ifelse(score >= 90, "A", 65 | ifelse(score >= 80, "B", 66 | ifelse(score >= 70, "C", "D")))) 67 | } 68 | 69 | # Test the function 70 | print(grade(85)) # Output: "B" 71 | print(grade(55)) # Output: "D" 72 | ``` 73 | 74 | ### Handling Missing Values with ifelse() 75 | 76 | The `ifelse()` function can handle missing values (`NA`) by specifying how to treat them in the `yes` or `no` arguments: 77 | 78 | ```R 79 | # Handling NA values with ifelse() 80 | values <- c(10, NA, 20, 30, NA) 81 | 82 | result <- ifelse(is.na(values), "Missing", values * 2) 83 | print(result) 84 | ``` 85 | 86 | In this example: 87 | - `values` is a vector containing numeric values and `NA`. 88 | - `ifelse(is.na(values), "Missing", values * 2)` doubles each non-NA value and replaces NA values with `"Missing"`. 89 | 90 | ## Conclusion 91 | 92 | In this tutorial, we explored the `ifelse()` function in R, a powerful tool for performing vectorized conditional operations. We covered its syntax, usage in simple and nested conditions, integration within functions, and handling of missing values (`NA`). Mastering `ifelse()` allows for efficient and concise data manipulation based on dynamic conditions. 93 | 94 | For more tutorials and guides, visit [www.codeswithpankaj.com](http://www.codeswithpankaj.com). 95 | 96 | --- 97 | 98 | Happy coding! 
99 | -------------------------------------------------------------------------------- /09 Day R while Loop/Example1.r: -------------------------------------------------------------------------------- 1 | # Example 1: Counting from 1 to 10 using a while loop 2 | count <- 1 3 | 4 | while (count <= 10) { 5 | cat("Count:", count, "\n") 6 | count <- count + 1 7 | } 8 | -------------------------------------------------------------------------------- /09 Day R while Loop/Example2.r: -------------------------------------------------------------------------------- 1 | # Example 2: Calculating the sum of numbers from 1 to 100 using a while loop 2 | n <- 1 3 | sum_numbers <- 0 4 | 5 | while (n <= 100) { 6 | sum_numbers <- sum_numbers + n 7 | n <- n + 1 8 | } 9 | 10 | cat("Sum of numbers from 1 to 100:", sum_numbers, "\n") 11 | -------------------------------------------------------------------------------- /09 Day R while Loop/Example3.r: -------------------------------------------------------------------------------- 1 | # Example 3: Countdown from 10 to 1 using a while loop 2 | countdown <- 10 3 | 4 | while (countdown >= 1) { 5 | cat("Countdown:", countdown, "\n") 6 | countdown <- countdown - 1 7 | } 8 | -------------------------------------------------------------------------------- /09 Day R while Loop/Example4.r: -------------------------------------------------------------------------------- 1 | # Example 4: User input validation using a while loop 2 | user_input <- -1 3 | 4 | while (user_input < 0) { 5 | user_input <- as.numeric(readline("Enter a positive number: ")) 6 | if (user_input < 0) { 7 | cat("Invalid input. 
Please enter a positive number.\n") 8 | } 9 | } 10 | 11 | cat("You entered:", user_input, "\n") 12 | -------------------------------------------------------------------------------- /09 Day R while Loop/README.md: -------------------------------------------------------------------------------- 1 | 2 | # **Tutorial on `while` Loop in R Programming** 3 | 4 | **Website Name**: [www.codeswithpankaj.com](http://www.codeswithpankaj.com) 5 | **Tutorial Name**: Codes With Pankaj 6 | 7 | --- 8 | 9 | ## **Table of Contents** 10 | 11 | 1. [Introduction to `while` Loop](#1-introduction-to-while-loop) 12 | 2. [Syntax of `while` Loop](#2-syntax-of-while-loop) 13 | 3. [Basic Example of `while` Loop](#3-basic-example-of-while-loop) 14 | 4. [Infinite `while` Loop](#4-infinite-while-loop) 15 | 5. [Breaking a `while` Loop](#5-breaking-a-while-loop) 16 | 6. [Using `next` Statement in a `while` Loop](#6-using-next-statement-in-a-while-loop) 17 | 7. [Nested `while` Loops](#7-nested-while-loops) 18 | 8. [Common Mistakes and Best Practices](#8-common-mistakes-and-best-practices) 19 | 9. [Summary and Conclusion](#9-summary-and-conclusion) 20 | 21 | --- 22 | 23 | ## **1. Introduction to `while` Loop** 24 | 25 | The `while` loop in R is a control flow statement that allows you to repeatedly execute a block of code as long as a specified condition is `TRUE`. The loop continues to run until the condition becomes `FALSE`. 26 | 27 | ### **Why Use `while` Loops?** 28 | 29 | - **Repetitive Tasks:** Ideal for tasks that need repetition until a condition is met. 30 | - **Dynamic Iterations:** Useful when the number of iterations is not predetermined and depends on runtime conditions. 31 | 32 | --- 33 | 34 | ## **2. Syntax of `while` Loop** 35 | 36 | The syntax of the `while` loop is simple and easy to grasp. 
37 | 38 | ### **Syntax:** 39 | 40 | ```r 41 | while (condition) { 42 | # Code to execute while the condition is TRUE 43 | } 44 | ``` 45 | 46 | - **condition**: A logical expression that returns `TRUE` or `FALSE`. 47 | - **Code block**: The code inside the braces `{}` will execute repeatedly as long as the condition evaluates to `TRUE`. 48 | 49 | --- 50 | 51 | ## **3. Basic Example of `while` Loop** 52 | 53 | Here’s a basic example to demonstrate the use of a `while` loop in R. 54 | 55 | ### **Example:** 56 | 57 | ```r 58 | # Initialize a counter variable 59 | counter <- 1 60 | 61 | # While loop to print numbers from 1 to 5 62 | while (counter <= 5) { 63 | print(paste("Counter:", counter)) 64 | counter <- counter + 1 # Increment the counter 65 | } 66 | ``` 67 | 68 | **Explanation:** 69 | 70 | - The variable `counter` is initialized to `1`. 71 | - The loop checks if `counter` is less than or equal to `5`. 72 | - If `TRUE`, it prints the value of `counter` and then increments `counter` by `1`. 73 | - The loop repeats until `counter` exceeds `5`. 74 | 75 | **Output:** 76 | ``` 77 | [1] "Counter: 1" 78 | [1] "Counter: 2" 79 | [1] "Counter: 3" 80 | [1] "Counter: 4" 81 | [1] "Counter: 5" 82 | ``` 83 | 84 | --- 85 | 86 | ## **4. Infinite `while` Loop** 87 | 88 | An infinite `while` loop occurs when the condition never evaluates to `FALSE`, causing the loop to run indefinitely. 89 | 90 | ### **Example:** 91 | 92 | ```r 93 | # Infinite while loop example 94 | counter <- 1 95 | 96 | while (counter > 0) { 97 | print(paste("Counter:", counter)) 98 | counter <- counter + 1 # Increment the counter 99 | } 100 | ``` 101 | 102 | **Explanation:** 103 | 104 | - Here, the condition `counter > 0` is always `TRUE` because `counter` is incremented continuously. 105 | - The loop will run indefinitely, printing the value of `counter` without stopping. 106 | 107 | **Caution:** 108 | Infinite loops can cause your program to hang or crash. Always ensure that your loop has a way to terminate. 
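One way to give a loop like this a guaranteed exit is to count iterations and stop at a fixed limit. The sketch below reuses the condition from the infinite loop above and adds a cap. The names `iterations` and `max_iterations` and the limit of 1000 are assumptions chosen for illustration, and `break` is covered in the next section.

```r
counter <- 1
iterations <- 0
max_iterations <- 1000  # hypothetical safety cap, pick a limit that suits the task

while (counter > 0) {
  iterations <- iterations + 1

  # Stop once the cap is reached, even though counter > 0 is still TRUE
  if (iterations >= max_iterations) {
    print("Safety cap reached, stopping the loop")
    break
  }

  counter <- counter + 1
}

print(paste("Loop stopped after", iterations, "iterations"))
```

A cap like this is a fallback rather than a fix. The loop condition itself should still be one that eventually becomes `FALSE`.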
109 | 110 | --- 111 | 112 | ## **5. Breaking a `while` Loop** 113 | 114 | You may want to exit a `while` loop before the condition becomes `FALSE`. The `break` statement allows you to do this. 115 | 116 | ### **Example:** 117 | 118 | ```r 119 | # Breaking out of a while loop 120 | counter <- 1 121 | 122 | while (counter <= 10) { 123 | print(paste("Counter:", counter)) 124 | 125 | if (counter == 5) { 126 | print("Breaking the loop") 127 | break # Exit the loop when counter equals 5 128 | } 129 | 130 | counter <- counter + 1 131 | } 132 | ``` 133 | 134 | **Explanation:** 135 | 136 | - The loop prints numbers from `1` to `5`. 137 | - When `counter` equals `5`, the `break` statement is executed, terminating the loop. 138 | 139 | **Output:** 140 | ``` 141 | [1] "Counter: 1" 142 | [1] "Counter: 2" 143 | [1] "Counter: 3" 144 | [1] "Counter: 4" 145 | [1] "Counter: 5" 146 | [1] "Breaking the loop" 147 | ``` 148 | 149 | --- 150 | 151 | ## **6. Using `next` Statement in a `while` Loop** 152 | 153 | The `next` statement is used to skip the current iteration of a loop and move to the next iteration. 154 | 155 | ### **Example:** 156 | 157 | ```r 158 | # Using next statement in a while loop 159 | counter <- 1 160 | 161 | while (counter <= 5) { 162 | 163 | if (counter == 3) { 164 | counter <- counter + 1 165 | next # Skip the rest of the loop when counter equals 3 166 | } 167 | 168 | print(paste("Counter:", counter)) 169 | counter <- counter + 1 170 | } 171 | ``` 172 | 173 | **Explanation:** 174 | 175 | - The loop prints numbers from `1` to `5` but skips `3` because of the `next` statement. 176 | - When `counter` equals `3`, the `next` statement skips the `print` statement for that iteration. 177 | 178 | **Output:** 179 | ``` 180 | [1] "Counter: 1" 181 | [1] "Counter: 2" 182 | [1] "Counter: 4" 183 | [1] "Counter: 5" 184 | ``` 185 | 186 | --- 187 | 188 | ## **7. Nested `while` Loops** 189 | 190 | You can nest one `while` loop inside another to perform more complex iterations. 
191 | 192 | ### **Example:** 193 | 194 | ```r 195 | # Nested while loops example 196 | outer_counter <- 1 197 | 198 | while (outer_counter <= 3) { 199 | 200 | inner_counter <- 1 201 | 202 | while (inner_counter <= 2) { 203 | print(paste("Outer Counter:", outer_counter, "Inner Counter:", inner_counter)) 204 | inner_counter <- inner_counter + 1 205 | } 206 | 207 | outer_counter <- outer_counter + 1 208 | } 209 | ``` 210 | 211 | **Explanation:** 212 | 213 | - The outer loop runs from `1` to `3`. 214 | - The inner loop runs from `1` to `2` for each iteration of the outer loop. 215 | - This results in combinations of outer and inner loop values being printed. 216 | 217 | **Output:** 218 | ```yaml 219 | [1] "Outer Counter: 1 Inner Counter: 1" 220 | [1] "Outer Counter: 1 Inner Counter: 2" 221 | [1] "Outer Counter: 2 Inner Counter: 1" 222 | [1] "Outer Counter: 2 Inner Counter: 2" 223 | [1] "Outer Counter: 3 Inner Counter: 1" 224 | [1] "Outer Counter: 3 Inner Counter: 2" 225 | ``` 226 | 227 | ## **8. Common Mistakes and Best Practices** 228 | 229 | ### **Common Mistakes** 230 | 231 | 1. **Infinite Loops:** 232 | - **Cause:** Not updating the loop condition. 233 | - **Example:** 234 | ```r 235 | counter <- 1 236 | 237 | while (counter <= 5) { 238 | print(counter) 239 | # counter is not updated, causing an infinite loop 240 | } 241 | ``` 242 | 243 | 2. **Incorrect Condition:** 244 | - **Cause:** Using a condition that never becomes `TRUE`. 245 | - **Example:** 246 | ```r 247 | counter <- 1 248 | 249 | while (counter > 5) { 250 | print(counter) 251 | counter <- counter + 1 252 | } 253 | ``` 254 | - **Solution:** Ensure that the initial value and condition are logically consistent. 255 | 256 | ### **Best Practices** 257 | 258 | - **Always Update the Condition:** Ensure the loop condition is updated inside the loop to avoid infinite loops. 259 | - **Use `break` Wisely:** The `break` statement should be used when you have a clear condition to exit the loop early. 
260 | - **Use `next` to Skip Iterations:** Use `next` to skip specific iterations without terminating the loop. 261 | - **Test Edge Cases:** Test your `while` loop with edge cases to ensure it behaves as expected. 262 | 263 | ## **9. Summary and Conclusion** 264 | 265 | This tutorial covered the essential aspects of the `while` loop in R programming. We explored its syntax, basic examples, potential pitfalls like infinite loops, and best practices for using `while` loops effectively. 266 | 267 | ### **Key Takeaways:** 268 | - **Repeating Tasks:** The `while` loop is useful for repeating tasks until a condition is met. 269 | - **Control Flow:** It provides a flexible control flow structure, especially when the number of iterations isn't known beforehand. 270 | - **Avoid Infinite Loops:** Always ensure your loop has a clear condition to terminate. 271 | 272 | The `Codes With Pankaj` tutorial on [www.codeswithpankaj.com](http://www.codeswithpankaj.com) 273 | -------------------------------------------------------------------------------- /10 Day R for Loop/Example1.r: -------------------------------------------------------------------------------- 1 | # Example 1: Counting from 1 to 10 using a for loop 2 | for (i in 1:10) { 3 | cat("Count:", i, "\n") 4 | } 5 | -------------------------------------------------------------------------------- /10 Day R for Loop/Example2.r: -------------------------------------------------------------------------------- 1 | # Example 2: Calculating the sum of numbers from 1 to 100 using a for loop 2 | sum_numbers <- 0 3 | 4 | for (i in 1:100) { 5 | sum_numbers <- sum_numbers + i 6 | } 7 | 8 | cat("Sum of numbers from 1 to 100:", sum_numbers, "\n") 9 | -------------------------------------------------------------------------------- /10 Day R for Loop/Example3.r: -------------------------------------------------------------------------------- 1 | # Example 3: Using a for loop to iterate through elements in a vector 2 | fruits <- c("apple", 
"banana", "cherry", "date") 3 | 4 | for (fruit in fruits) { 5 | cat("Fruit:", fruit, "\n") 6 | } 7 | -------------------------------------------------------------------------------- /10 Day R for Loop/Example4.r: -------------------------------------------------------------------------------- 1 | # Example 4: Generating the Fibonacci series using a for loop 2 | n <- 10 3 | fibonacci <- numeric(n) 4 | fibonacci[1] <- 0 5 | fibonacci[2] <- 1 6 | 7 | for (i in 3:n) { 8 | fibonacci[i] <- fibonacci[i - 1] + fibonacci[i - 2] 9 | } 10 | 11 | cat("Fibonacci Series:", fibonacci, "\n") 12 | -------------------------------------------------------------------------------- /11 Day R repeat Loop/ReversingString.r: -------------------------------------------------------------------------------- 1 | # Example 1: Using a repeat loop to reverse a string 2 | input_string <- "Hello, World!" 3 | output_string <- "" 4 | index <- nchar(input_string) 5 | 6 | repeat { 7 | if (index == 0) { 8 | break # Terminate the loop when the entire string is reversed 9 | } 10 | output_string <- paste(output_string, substr(input_string, index, index), sep = "") 11 | index <- index - 1 12 | } 13 | 14 | cat("Reversed String:", output_string, "\n") 15 | -------------------------------------------------------------------------------- /11 Day R repeat Loop/UserInputValidation.r: -------------------------------------------------------------------------------- 1 | # Example 2: Using a repeat loop for user input validation 2 | repeat { 3 | user_input <- as.numeric(readline("Enter a positive number: ")) 4 | 5 | if (!is.na(user_input) && user_input > 0) { 6 | cat("You entered a valid positive number:", user_input, "\n") 7 | break # Terminate the loop when valid input is provided 8 | } else { 9 | cat("Invalid input. 
Please enter a positive number.\n") 10 | } 11 | } 12 | -------------------------------------------------------------------------------- /12 Day R break and next/Example1.r: -------------------------------------------------------------------------------- 1 | # Example 1: Using break in a for loop to exit when a condition is met 2 | for (i in 1:10) { 3 | if (i == 5) { 4 | break # Exit the loop when i equals 5 5 | } 6 | cat("Value:", i, "\n") 7 | } 8 | -------------------------------------------------------------------------------- /12 Day R break and next/Example2.r: -------------------------------------------------------------------------------- 1 | # Example 2: Using next in a for loop to skip even numbers 2 | for (i in 1:5) { 3 | if (i %% 2 == 0) { 4 | next # Skip even numbers 5 | } 6 | cat("Value:", i, "\n") 7 | } 8 | -------------------------------------------------------------------------------- /12 Day R break and next/Example3.r: -------------------------------------------------------------------------------- 1 | # Example 3: Using break in a while loop to exit when a condition is met 2 | x <- 1 3 | 4 | while (TRUE) { 5 | cat("Value:", x, "\n") 6 | if (x >= 5) { 7 | break # Exit the loop when x is greater than or equal to 5 8 | } 9 | x <- x + 1 10 | } 11 | -------------------------------------------------------------------------------- /13 Day R Functions /Closure Function.r: -------------------------------------------------------------------------------- 1 | # Example of a closure function 2 | make_multiplier <- function(factor) { 3 | function(x) { 4 | return(x * factor) 5 | } 6 | } 7 | 8 | multiply_by_2 <- make_multiplier(2) 9 | multiply_by_3 <- make_multiplier(3) 10 | 11 | result1 <- multiply_by_2(5) 12 | result2 <- multiply_by_3(4) 13 | 14 | cat("Result1:", result1, "\n") 15 | cat("Result2:", result2, "\n") 16 | -------------------------------------------------------------------------------- /13 Day R Functions /Generator Function.r: 
-------------------------------------------------------------------------------- 1 | # Example of a generator-style function 2 | # Base R has no yield(); a closure that keeps state between calls is used instead 3 | 4 | fibonacci_generator <- function() { 5 | a <- 0 6 | b <- 1 7 | function() { 8 | current <- a 9 | a <<- b 10 | b <<- current + b 11 | current 12 | } 13 | } 14 | 15 | next_fibonacci <- fibonacci_generator() 16 | 17 | # Generate the first 10 Fibonacci numbers 18 | first_10_fibonacci <- vapply(1:10, function(i) next_fibonacci(), numeric(1)) 19 | cat("First 10 Fibonacci numbers:", first_10_fibonacci, "\n") 20 | -------------------------------------------------------------------------------- /13 Day R Functions /Mapping and Reducing Functions.r: -------------------------------------------------------------------------------- 1 | # Example of mapping and reducing functions 2 | library(purrr) 3 | 4 | # Square each element in a list 5 | numbers <- list(1, 2, 3, 4, 5) 6 | squared_numbers <- map(numbers, ~ .x^2) 7 | cat("Squared numbers:", unlist(squared_numbers), "\n") 8 | 9 | # Calculate the product of all elements in a vector 10 | product_result <- reduce(c(1, 2, 3, 4, 5), `*`) 11 | cat("Product of numbers:", product_result, "\n") 12 | -------------------------------------------------------------------------------- /13 Day R Functions /README.md: -------------------------------------------------------------------------------- 1 | # R Functions 2 | 3 | In R, functions are blocks of code that can be defined and reused to perform specific tasks or operations. Functions encapsulate a series of statements, accept input (arguments), and often return output values. Functions are a fundamental concept in R programming and are essential for modularizing code and making it more organized and reusable. 4 | 5 | Here's the basic structure of defining and using a function in R: 6 | 7 | ```R 8 | # Function definition 9 | function_name <- function(arg1, arg2, ...)
{ 10 | # Function body: code to perform a specific task 11 | # You can use the arguments (arg1, arg2, ...) within the function 12 | 13 | # Return a value (optional) 14 | return(result) 15 | } 16 | 17 | # Function call 18 | output <- function_name(arg1_value, arg2_value, ...) 19 | ``` 20 | 21 | Key components of a function: 22 | 23 | - `function_name`: The name of the function you define. 24 | - `arg1`, `arg2`, ...: Arguments or parameters that the function accepts. You can have zero or more arguments. 25 | - `result`: The value the function returns (optional). 26 | - `arg1_value`, `arg2_value`, ...: Actual values or expressions provided when calling the function. 27 | 28 | Here's a simple example of a function that adds two numbers and returns the result: 29 | 30 | ```R 31 | # Define a function to add two numbers 32 | add_numbers <- function(x, y) { 33 | result <- x + y 34 | return(result) 35 | } 36 | 37 | # Call the function 38 | sum_result <- add_numbers(5, 3) 39 | cat("Sum:", sum_result, "\n") 40 | ``` 41 | 42 | In this example: 43 | 44 | - `add_numbers` is the function name. 45 | - `x` and `y` are the function's arguments. 46 | - Inside the function, `result` is calculated as the sum of `x` and `y`. 47 | - The `return(result)` statement returns the result. 48 | - We call the function with values 5 and 3 and store the result in `sum_result`. 49 | 50 | Functions can be much more complex and perform various operations, including data analysis, data manipulation, and generating outputs. They are a fundamental building block in writing structured and maintainable R code. 51 | 52 | ## Examples of different types of functions in R: 53 | 54 | **1. 
Built-in Functions:** 55 | 56 | ```R 57 | # Example of built-in functions 58 | # Calculate the sum of a vector 59 | numbers <- c(5, 10, 15, 20) 60 | sum_result <- sum(numbers) 61 | cat("Sum of numbers:", sum_result, "\n") 62 | 63 | # Calculate the mean of a vector 64 | mean_result <- mean(numbers) 65 | cat("Mean of numbers:", mean_result, "\n") 66 | 67 | # Find the length of a vector 68 | length_result <- length(numbers) 69 | cat("Length of vector:", length_result, "\n") 70 | ``` 71 | 72 | In this example, we use built-in functions `sum()`, `mean()`, and `length()` to perform basic operations on a numeric vector. 73 | 74 | **2. User-Defined Function:** 75 | 76 | ```R 77 | # Example of a user-defined function 78 | # Define a function to calculate the area of a rectangle 79 | calculate_rectangle_area <- function(length, width) { 80 | area <- length * width 81 | return(area) 82 | } 83 | 84 | # Call the user-defined function 85 | rectangle_area <- calculate_rectangle_area(4, 6) 86 | cat("Area of rectangle:", rectangle_area, "\n") 87 | ``` 88 | 89 | Here, we define a user-defined function `calculate_rectangle_area()` to calculate the area of a rectangle based on its length and width. 90 | 91 | **3. Anonymous Function (Lambda Function):** 92 | 93 | ```R 94 | # Example of an anonymous function 95 | # Use lapply to square each element in a vector 96 | numbers <- c(1, 2, 3, 4, 5) 97 | squared_numbers <- lapply(numbers, function(x) x^2) 98 | cat("Squared numbers:", unlist(squared_numbers), "\n") 99 | ``` 100 | 101 | In this example, we use an anonymous function within the `lapply()` function to square each element in a vector. 102 | 103 | **4. 
Higher-Order Function:** 104 | 105 | ```R 106 | # Example of a higher-order function 107 | # Use sapply to apply a function to each element in a list 108 | fruits_list <- list("apple", "banana", "cherry") 109 | lengths <- sapply(fruits_list, nchar) 110 | cat("Lengths of fruits:", lengths, "\n") 111 | ``` 112 | 113 | Here, we use the higher-order function `sapply()` to apply the `nchar()` function to each element in a list of fruits, which gives the number of characters in each name. (`length()` would return 1 for every element, because each element is a single string.) 114 | 115 | **5. Special-Purpose Function:** 116 | 117 | ```R 118 | # Example of a special-purpose function 119 | # Using dplyr's filter function to filter data 120 | library(dplyr) 121 | 122 | # Sample data frame 123 | data <- data.frame(Name = c("Alice", "Bob", "Charlie", "David"), 124 | Age = c(25, 30, 22, 35)) 125 | 126 | # Filter rows where Age is greater than 30 127 | filtered_data <- filter(data, Age > 30) 128 | print(filtered_data) 129 | ``` 130 | 131 | Here, we use the `filter()` function from the `dplyr` package to filter rows based on a condition in a data frame. 132 | 133 | These examples illustrate different types of functions in R, including built-in functions, user-defined functions, anonymous functions, higher-order functions, and special-purpose functions.
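
The ideas above (arguments, return values, and higher-order functions) can be combined in a short sketch. The helper name `raise_to_power` and its `power` parameter are hypothetical, chosen only for illustration. It shows a default argument, an implicit return value, and `vapply()`, which behaves like `sapply()` but requires the result type to be declared up front.

```R
# Hypothetical helper: `power` defaults to 2 when the caller omits it
raise_to_power <- function(x, power = 2) {
  x^power  # the last evaluated expression is returned implicitly
}

squares <- raise_to_power(1:5)            # uses the default power
cubes <- raise_to_power(1:5, power = 3)   # overrides the default by name

# vapply() is a higher-order function: it takes nchar as an argument
# and checks that every result is a single integer
name_lengths <- vapply(c("apple", "banana", "cherry"), nchar, integer(1))

cat("Squares:", squares, "\n")
cat("Cubes:", cubes, "\n")
cat("Name lengths:", name_lengths, "\n")
```

Declaring the result type with `integer(1)` makes `vapply()` fail loudly if the function ever returns something unexpected, which can be easier to debug than the silent simplification done by `sapply()`.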
-------------------------------------------------------------------------------- /13 Day R Functions /Recursive Function.r: -------------------------------------------------------------------------------- 1 | # Example of a recursive function to calculate factorial 2 | calculate_factorial <- function(n) { 3 | if (n == 0 || n == 1) { 4 | return(1) 5 | } else { 6 | return(n * calculate_factorial(n - 1)) 7 | } 8 | } 9 | 10 | # Calculate factorial of 5 11 | factorial_of_5 <- calculate_factorial(5) 12 | cat("Factorial of 5:", factorial_of_5, "\n") 13 | -------------------------------------------------------------------------------- /15 Day R Strings /README.md: -------------------------------------------------------------------------------- 1 | # R String Methods 2 | 3 | **Introduction:** 4 | In R, strings are sequences of characters used to represent text data. Manipulating strings is essential in data cleaning, text analysis, and data visualization. This tutorial covers the most commonly used string manipulation methods in R, with examples to illustrate each one. 5 | 6 | #### Prerequisites 7 | - Basic knowledge of R programming. 8 | - R installed on your system or access to RStudio. 9 | 10 | ### 1. **Creating Strings in R** 11 | 12 | Strings in R are created using double or single quotes. The `c()` function can create a vector of strings. 13 | 14 | ```r 15 | # Single string 16 | single_string <- "Hello, World!" 17 | print(single_string) 18 | 19 | # Vector of strings 20 | string_vector <- c("apple", "banana", "cherry") 21 | print(string_vector) 22 | ``` 23 | 24 | ### 2. **Basic String Functions** 25 | 26 | #### `nchar()`: Get String Length 27 | The `nchar()` function returns the number of characters in a string. 28 | 29 | ```r 30 | text <- "R Programming" 31 | nchar(text) 32 | # Output: 13 33 | ``` 34 | 35 | #### `tolower()` and `toupper()`: Convert Case 36 | These functions convert strings to lowercase or uppercase. 37 | 38 | ```r 39 | text <- "Hello, R!" 
40 | tolower(text) # Output: "hello, r!" 41 | toupper(text) # Output: "HELLO, R!" 42 | ``` 43 | 44 | ### 3. **Substring and Character Extraction** 45 | 46 | #### `substr()`: Extract or Replace Substring 47 | The `substr()` function extracts a part of a string based on a specified range. It can also replace a substring if used on the left side of an assignment; the replacement overwrites characters in place, so the string keeps its original length. 48 | 49 | ```r 50 | text <- "Data Science" 51 | substr(text, 1, 4) # Extracts "Data" 52 | 53 | # Replace substring 54 | substr(text, 6, 12) <- "Systems" 55 | print(text) # Output: "Data Systems" 56 | ``` 57 | 58 | #### `substring()`: Extract Substring from Start Position 59 | `substring()` is similar to `substr()` but allows extraction from a specific start position until the end. 60 | 61 | ```r 62 | text <- "Machine Learning" 63 | substring(text, 9) # Extracts "Learning" 64 | ``` 65 | 66 | ### 4. **String Concatenation** 67 | 68 | #### `paste()` and `paste0()`: Concatenate Strings 69 | The `paste()` function joins strings with a specified separator (a space by default), while `paste0()` joins strings without any separator. 70 | 71 | ```r 72 | # Using paste 73 | first <- "Data" 74 | second <- "Science" 75 | paste(first, second) # Output: "Data Science" 76 | paste(first, second, sep="-") # Output: "Data-Science" 77 | 78 | # Using paste0 79 | paste0(first, second) # Output: "DataScience" 80 | ``` 81 | 82 | ### 5. **Finding Patterns in Strings** 83 | 84 | #### `grep()`: Search for Pattern 85 | The `grep()` function searches for patterns within a string or vector of strings and returns the indices of matches. 86 | 87 | ```r 88 | text_vector <- c("apple", "banana", "cherry") 89 | grep("a", text_vector) # Output: 1 2 (positions where "a" is found) 90 | ``` 91 | 92 | #### `grepl()`: Logical Search for Pattern 93 | `grepl()` returns a logical vector indicating if a pattern is found in each element of a string vector. 94 | 95 | ```r 96 | grepl("a", text_vector) # Output: TRUE TRUE FALSE 97 | ``` 98 | 99 | ### 6.
**Replacing Patterns** 100 | 101 | #### `gsub()`: Replace All Occurrences 102 | The `gsub()` function replaces all instances of a pattern within a string with a replacement string. 103 | 104 | ```r 105 | text <- "I love apples" 106 | gsub("apples", "oranges", text) # Output: "I love oranges" 107 | ``` 108 | 109 | #### `sub()`: Replace First Occurrence 110 | The `sub()` function replaces only the first occurrence of a pattern. 111 | 112 | ```r 113 | text <- "banana apple banana" 114 | sub("banana", "grape", text) # Output: "grape apple banana" 115 | ``` 116 | 117 | ### 7. **Splitting Strings** 118 | 119 | #### `strsplit()`: Split Strings into a List 120 | The `strsplit()` function splits a string based on a specified delimiter and returns a list with one character vector per input string. 121 | 122 | ```r 123 | text <- "apple,banana,cherry" 124 | strsplit(text, ",") # Output: list(c("apple", "banana", "cherry")) 125 | ``` 126 | 127 | ### 8. **String Matching and Extraction with Regular Expressions** 128 | 129 | #### `regexpr()` and `gregexpr()`: Find Pattern Position 130 | These functions return the starting positions of the first or all occurrences of a pattern in a string. 131 | 132 | ```r 133 | text <- "Data Science with R" 134 | regexpr("R", text) # Finds first "R" position 135 | gregexpr("e", text) # Finds all "e" positions 136 | ``` 137 | 138 | #### `regmatches()`: Extract or Replace Matched Substrings 139 | Used with `regexpr()` or `gregexpr()` results to extract or replace matched substrings. 140 | 141 | ```r 142 | matches <- gregexpr("a", text) 143 | regmatches(text, matches) # Output: list(c("a", "a")) 144 | ``` 145 | 146 | ### 9. **Advanced String Manipulation with `stringr` Package** 147 | 148 | The `stringr` package provides additional functions and is useful for complex string manipulation. If not already installed, install it with `install.packages("stringr")`. 149 | 150 | #### `str_detect()`: Detect Pattern 151 | `str_detect()` checks if a pattern exists within a string vector.
152 | 153 | ```r 154 | library(stringr) 155 | str_detect(text_vector, "a") # Output: TRUE TRUE FALSE 156 | ``` 157 | 158 | #### `str_replace()`: Replace Pattern 159 | Similar to `sub()`, `str_replace()` replaces only the first occurrence of a pattern in a string; use `str_replace_all()` to replace every occurrence, as `gsub()` does. 160 | 161 | ```r 162 | str_replace("I love apples", "apples", "oranges") # Output: "I love oranges" 163 | ``` 164 | 165 | ### Summary Table of R String Methods 166 | 167 | | Function | Description | Example | 168 | |-----------------|-----------------------------------------|----------------------------------------------| 169 | | `nchar()` | Get string length | `nchar("text")` | 170 | | `tolower()` | Convert to lowercase | `tolower("TEXT")` | 171 | | `toupper()` | Convert to uppercase | `toupper("text")` | 172 | | `substr()` | Extract substring | `substr("Data", 1, 2)` | 173 | | `paste()` | Concatenate with separator | `paste("A", "B", sep="-")` | 174 | | `paste0()` | Concatenate without separator | `paste0("A", "B")` | 175 | | `grep()` | Search for pattern | `grep("a", c("apple", "banana"))` | 176 | | `gsub()` | Replace all occurrences | `gsub("apple", "orange", "I like apple")` | 177 | | `strsplit()` | Split string into list | `strsplit("a,b,c", ",")` | 178 | | `regexpr()` | Get position of first match | `regexpr("a", "apple")` | 179 | | `gregexpr()` | Get positions of all matches | `gregexpr("a", "banana")` | 180 | | `regmatches()` | Extract matches from positions | `regmatches("banana", gregexpr("a", "banana"))` | 181 | 182 | This guide provides an overview of R string methods, from basic manipulations to pattern matching and replacement. Use these functions to work with text data efficiently in R.
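
As a rough illustration of how the base R functions in this guide fit together, the sketch below cleans a small comma-separated string. The input value and the variable names (`raw_fruits`, `parts`, `labels`) are made up for the example.

```r
# Made-up input with inconsistent spacing and case
raw_fruits <- "Apple, banana ,CHERRY"

# strsplit() returns a list, so take its first element to get a character vector
parts <- strsplit(raw_fruits, ",")[[1]]

# Remove spaces with gsub() and normalise case with tolower()
parts <- tolower(gsub(" ", "", parts))

# grepl() flags which elements contain the pattern "an"
has_an <- grepl("an", parts)

# Capitalise the first letter: substr() takes it, substring() takes the rest
labels <- paste0(toupper(substr(parts, 1, 1)), substring(parts, 2))

print(parts)
print(has_an)
print(labels)
```

Each step works on the whole vector at once, so no explicit loop is needed.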
183 | -------------------------------------------------------------------------------- /16 Day R Vectors/README.md: -------------------------------------------------------------------------------- 1 | # R Vectors in Detail 2 | 3 | **Introduction:** 4 | Vectors are one of the basic data structures in R and are fundamental to handling and processing data. A vector is a sequence of data elements of the same type, such as numbers, characters, or logical values. This tutorial explores creating, manipulating, and applying operations on vectors in R. 5 | 6 | ### 1. **Creating Vectors in R** 7 | 8 | Vectors in R can be created using the `c()` function, which combines values into a vector. 9 | 10 | ```r 11 | # Numeric vector 12 | numeric_vector <- c(1, 2, 3, 4, 5) 13 | print(numeric_vector) 14 | 15 | # Character vector 16 | character_vector <- c("apple", "banana", "cherry") 17 | print(character_vector) 18 | 19 | # Logical vector 20 | logical_vector <- c(TRUE, FALSE, TRUE) 21 | print(logical_vector) 22 | ``` 23 | 24 | ### 2. **Types of Vectors** 25 | 26 | R vectors can be of different types: 27 | - **Numeric**: Contains decimal or integer values. 28 | - **Character**: Holds text or string data. 29 | - **Logical**: Contains TRUE or FALSE values. 30 | - **Integer**: Stores whole numbers, created with `L` suffix. 31 | - **Complex**: Holds complex numbers. 32 | 33 | ```r 34 | integer_vector <- c(1L, 2L, 3L) # Integer vector 35 | complex_vector <- c(1+2i, 3-4i) # Complex vector 36 | print(integer_vector) 37 | print(complex_vector) 38 | ``` 39 | 40 | ### 3. **Accessing Vector Elements** 41 | 42 | You can access elements in a vector by their index using square brackets `[]`. R uses 1-based indexing. 43 | 44 | ```r 45 | vector <- c("a", "b", "c", "d", "e") 46 | vector[1] # Access first element 47 | vector[2:4] # Access elements from index 2 to 4 48 | vector[c(1, 3, 5)] # Access elements at specified positions 49 | ``` 50 | 51 | ### 4. 
**Modifying Vectors** 52 | 53 | Vectors in R are mutable, meaning you can change elements in a vector by accessing their indices. 54 | 55 | ```r 56 | numeric_vector <- c(1, 2, 3, 4, 5) 57 | numeric_vector[2] <- 10 # Change second element 58 | print(numeric_vector) # Output: 1 10 3 4 5 59 | ``` 60 | 61 | ### 5. **Vector Operations** 62 | 63 | #### Arithmetic Operations 64 | R supports arithmetic operations on vectors, and operations are performed element-wise. 65 | 66 | ```r 67 | a <- c(1, 2, 3) 68 | b <- c(4, 5, 6) 69 | a + b # Addition: Output: 5 7 9 70 | a * b # Multiplication: Output: 4 10 18 71 | a - b # Subtraction: Output: -3 -3 -3 72 | ``` 73 | 74 | #### Logical Operations 75 | Logical operations return a logical vector with comparisons done element-wise. 76 | 77 | ```r 78 | a <- c(1, 2, 3) 79 | b <- c(3, 2, 1) 80 | a > b # Output: FALSE FALSE TRUE 81 | a == b # Output: FALSE TRUE FALSE 82 | ``` 83 | 84 | ### 6. **Common Vector Functions** 85 | 86 | #### `length()`: Get Vector Length 87 | Returns the number of elements in a vector. 88 | 89 | ```r 90 | vector <- c("apple", "banana", "cherry") 91 | length(vector) # Output: 3 92 | ``` 93 | 94 | #### `sum()`, `mean()`, `min()`, `max()`: Summary Functions 95 | These functions provide common summary statistics. 96 | 97 | ```r 98 | numeric_vector <- c(1, 2, 3, 4, 5) 99 | sum(numeric_vector) # Output: 15 100 | mean(numeric_vector) # Output: 3 101 | min(numeric_vector) # Output: 1 102 | max(numeric_vector) # Output: 5 103 | ``` 104 | 105 | #### `sort()`: Sort Vector 106 | Sorts elements in ascending or descending order. 107 | 108 | ```r 109 | vector <- c(5, 2, 8, 1) 110 | sort(vector) # Output: 1 2 5 8 111 | sort(vector, decreasing = TRUE) # Output: 8 5 2 1 112 | ``` 113 | 114 | ### 7. **Vector Recycling** 115 | 116 | If vectors of different lengths are used in an operation, R "recycles" the shorter vector to match the length of the longer one. 
117 | 118 | ```r 119 | a <- c(1, 2, 3) 120 | b <- c(4, 5) 121 | a + b # Output: 5 7 7, with a warning because 3 is not a multiple of 2 (b is recycled as c(4, 5, 4)) 122 | ``` 123 | 124 | ### 8. **Filtering Vectors** 125 | 126 | Use logical conditions to filter elements within a vector. 127 | 128 | ```r 129 | numeric_vector <- c(1, 2, 3, 4, 5) 130 | numeric_vector[numeric_vector > 3] # Output: 4 5 131 | ``` 132 | 133 | ### Summary Table of R Vector Functions 134 | 135 | | Function | Description | Example | 136 | |---------------|----------------------------------------|--------------------------------| 137 | | `c()` | Combine elements into a vector | `c(1, 2, 3)` | 138 | | `length()` | Get number of elements in vector | `length(vector)` | 139 | | `sum()` | Sum of vector elements | `sum(c(1, 2, 3))` | 140 | | `mean()` | Mean of vector elements | `mean(c(1, 2, 3))` | 141 | | `min()` | Minimum value in vector | `min(c(1, 2, 3))` | 142 | | `max()` | Maximum value in vector | `max(c(1, 2, 3))` | 143 | | `sort()` | Sort elements in vector | `sort(c(3, 1, 2))` | 144 | | `vector[i]` | Access ith element | `vector[1]` | 145 | | `vector[cond]`| Filter elements by condition | `vector[vector > 2]` | 146 | 147 | This guide provides an overview of vectors in R, from creation and access to various operations and functions. Use these concepts to work efficiently with vector data in R.
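
Recycling and filtering are often used together. The following is a small sketch with made-up values (`prices`, `multipliers`). Because the longer length (4) is a multiple of the shorter length (2), R recycles silently. `which()` is a base R function that returns positions rather than values.

```r
prices <- c(10, 20, 30, 40)
multipliers <- c(1, 2)  # recycled as c(1, 2, 1, 2)

# Element-wise multiplication after recycling
adjusted <- prices * multipliers

# Logical filtering keeps the values that satisfy the condition
expensive <- adjusted[adjusted > 30]

# which() returns the positions of those values instead
expensive_positions <- which(adjusted > 30)

print(adjusted)
print(expensive)
print(expensive_positions)
```

If the lengths did not divide evenly, the same code would still run, but R would emit a warning, which is usually a sign that the vectors were not meant to be combined.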
148 | -------------------------------------------------------------------------------- /16 Day R Vectors/Vectors.r: -------------------------------------------------------------------------------- 1 | # Creating numeric, character, and logical vectors 2 | numeric_vector <- c(1, 2, 3, 4, 5) 3 | character_vector <- c("apple", "banana", "cherry") 4 | logical_vector <- c(TRUE, FALSE, TRUE, FALSE) 5 | 6 | # Performing operations on numeric vectors 7 | vector1 <- c(1, 2, 3) 8 | vector2 <- c(4, 5, 6) 9 | result_addition <- vector1 + vector2 10 | result_multiplication <- vector1 * vector2 11 | 12 | # Accessing elements of a vector 13 | fruit <- character_vector[2] 14 | 15 | # Finding the length, sum, and mean of a numeric vector 16 | vector_length <- length(numeric_vector) 17 | vector_sum <- sum(numeric_vector) 18 | vector_mean <- mean(numeric_vector) 19 | 20 | # Element-wise comparisons on logical vectors 21 | comparison_result <- logical_vector & c(TRUE, TRUE, FALSE, FALSE) 22 | 23 | # Vectorized operations 24 | numeric_vector <- c(1, 2, 3, 4, 5) 25 | square_vector <- numeric_vector^2 26 | 27 | # Assigning names to vector elements 28 | named_vector <- c(a = 10, b = 20, c = 30) 29 | 30 | # Creating a factor 31 | gender <- c("Male", "Female", "Male", "Female") 32 | gender_factor <- factor(gender) 33 | 34 | # Output 35 | cat("Numeric Vector:", numeric_vector, "\n") 36 | cat("Character Vector:", character_vector, "\n") 37 | cat("Logical Vector:", logical_vector, "\n") 38 | cat("Result of Addition:", result_addition, "\n") 39 | cat("Result of Multiplication:", result_multiplication, "\n") 40 | cat("Accessed Element:", fruit, "\n") 41 | cat("Length of Numeric Vector:", vector_length, "\n") 42 | cat("Sum of Numeric Vector:", vector_sum, "\n") 43 | cat("Mean of Numeric Vector:", vector_mean, "\n") 44 | cat("Logical Comparison Result:", comparison_result, "\n") 45 | cat("Vectorized Square:", square_vector, "\n") 46 | cat("Named Vector:", named_vector, "\n") 47 | cat("Gender 
Factor:", as.character(gender_factor), "\n") 48 | -------------------------------------------------------------------------------- /17 Day R Matrix/Matrix.r: -------------------------------------------------------------------------------- 1 | # Creating a numeric matrix 2 | numeric_matrix <- matrix(data = 1:12, nrow = 3, ncol = 4) 3 | 4 | # Creating a character matrix with row and column names 5 | char_matrix <- matrix(data = c("A", "B", "C", "D"), nrow = 2, ncol = 2, 6 | dimnames = list(c("Row1", "Row2"), c("Col1", "Col2"))) 7 | 8 | # Accessing elements of a matrix 9 | element1 <- numeric_matrix[2, 3] # Row 2, Column 3 10 | element2 <- char_matrix[1, 2] # Row 1, Column 2 11 | 12 | # Transpose a matrix 13 | transposed_matrix <- t(numeric_matrix) 14 | 15 | # Element-wise operations 16 | addition_matrix <- numeric_matrix + 10 17 | 18 | # Combining matrices vertically (rbind requires the same number of columns) 19 | matrix3 <- matrix(13:20, nrow = 2) 20 | combined_matrix <- rbind(numeric_matrix, matrix3) 21 | 22 | # Output 23 | cat("Numeric Matrix:\n") 24 | print(numeric_matrix) 25 | cat("\nCharacter Matrix:\n") 26 | print(char_matrix) 27 | cat("\nAccessed Element 1:", element1, "\n") 28 | cat("Accessed Element 2:", element2, "\n") 29 | cat("\nTransposed Matrix:\n") 30 | print(transposed_matrix) 31 | cat("\nMatrix After Addition:\n") 32 | print(addition_matrix) 33 | cat("\nCombined Matrix:\n") 34 | print(combined_matrix) 35 | -------------------------------------------------------------------------------- /17 Day R Matrix/Question.md: -------------------------------------------------------------------------------- 1 | 2 | ### Questions 3 | 4 | 1. **Basic Matrix Creation** 5 | - Create a 5x2 matrix with numbers from 1 to 10. Display the matrix. 6 | 7 | ```R 8 | # Your code here 9 | ``` 10 | 11 | 2. **Matrix by Rows** 12 | - Create a 4x3 matrix with numbers from 1 to 12, filled by rows. Display the matrix. 13 | 14 | ```R 15 | # Your code here 16 | ``` 17 | 18 | 3.
**Named Matrix** 19 | - Create a 3x3 matrix with numbers from 1 to 9. Name the rows as "R1", "R2", "R3" and the columns as "C1", "C2", "C3". Display the matrix. 20 | 21 | ```R 22 | # Your code here 23 | ``` 24 | 25 | 4. **Accessing Matrix Elements** 26 | - Given the following matrix: 27 | 28 | ```R 29 | access_matrix <- matrix(1:9, nrow = 3, ncol = 3) 30 | ``` 31 | 32 | Write code to access the element in the 3rd row and 2nd column. 33 | 34 | ```R 35 | # Your code here 36 | ``` 37 | 38 | 5. **Matrix Addition** 39 | - Create two 2x2 matrices: 40 | 41 | ```R 42 | matrix_a <- matrix(c(1, 2, 3, 4), nrow = 2, ncol = 2) 43 | matrix_b <- matrix(c(4, 3, 2, 1), nrow = 2, ncol = 2) 44 | ``` 45 | 46 | Add these two matrices together and display the result. 47 | 48 | ```R 49 | # Your code here 50 | ``` 51 | 52 | 6. **Matrix Multiplication** 53 | - Given the matrices: 54 | 55 | ```R 56 | matrix_x <- matrix(c(1, 2, 3, 4), nrow = 2, ncol = 2) 57 | matrix_y <- matrix(c(5, 6, 7, 8), nrow = 2, ncol = 2) 58 | ``` 59 | 60 | Multiply these two matrices together and display the result. 61 | 62 | ```R 63 | # Your code here 64 | ``` 65 | 66 | 7. **Transpose of a Matrix** 67 | - Create a 2x3 matrix with numbers from 1 to 6. Transpose the matrix and display the result. 68 | 69 | ```R 70 | # Your code here 71 | ``` 72 | 73 | 8. **Element-wise Operations** 74 | - Create a 3x3 matrix with numbers from 1 to 9. Multiply each element by 2 and display the resulting matrix. 75 | 76 | ```R 77 | # Your code here 78 | ``` 79 | 80 | 9. **Combining Matrices** 81 | - Create two matrices: 82 | 83 | ```R 84 | mat1 <- matrix(1:4, nrow = 2) 85 | mat2 <- matrix(5:8, nrow = 2) 86 | ``` 87 | 88 | Combine these matrices by columns and display the resulting matrix. 89 | 90 | ```R 91 | # Your code here 92 | ``` 93 | 94 | 10. **Matrix Determinant (Advanced)** 95 | - Create a 2x2 matrix and calculate its determinant. 
96 | 97 | ```R 98 | det_matrix <- matrix(c(4, 2, 3, 1), nrow = 2) 99 | # Your code here 100 | ``` 101 | 102 | ### Answers 103 | 104 | Below are the solutions to the above questions: 105 | 106 | 1. **Basic Matrix Creation** 107 | 108 | ```R 109 | matrix_5x2 <- matrix(1:10, nrow = 5, ncol = 2) 110 | print(matrix_5x2) 111 | ``` 112 | 113 | 2. **Matrix by Rows** 114 | 115 | ```R 116 | matrix_4x3_byrow <- matrix(1:12, nrow = 4, ncol = 3, byrow = TRUE) 117 | print(matrix_4x3_byrow) 118 | ``` 119 | 120 | 3. **Named Matrix** 121 | 122 | ```R 123 | row_names <- c("R1", "R2", "R3") 124 | col_names <- c("C1", "C2", "C3") 125 | named_matrix <- matrix(1:9, nrow = 3, ncol = 3, dimnames = list(row_names, col_names)) 126 | print(named_matrix) 127 | ``` 128 | 129 | 4. **Accessing Matrix Elements** 130 | 131 | ```R 132 | element <- access_matrix[3, 2] 133 | print(element) # Output: 6 (the matrix is filled column by column) 134 | ``` 135 | 136 | 5. **Matrix Addition** 137 | 138 | ```R 139 | sum_matrix <- matrix_a + matrix_b 140 | print(sum_matrix) 141 | ``` 142 | 143 | 6. **Matrix Multiplication** 144 | 145 | ```R 146 | product_matrix <- matrix_x %*% matrix_y 147 | print(product_matrix) 148 | ``` 149 | 150 | 7. **Transpose of a Matrix** 151 | 152 | ```R 153 | matrix_2x3 <- matrix(1:6, nrow = 2, ncol = 3) 154 | transposed_matrix <- t(matrix_2x3) 155 | print(transposed_matrix) 156 | ``` 157 | 158 | 8. **Element-wise Operations** 159 | 160 | ```R 161 | matrix_3x3 <- matrix(1:9, nrow = 3, ncol = 3) 162 | multiplied_matrix <- matrix_3x3 * 2 163 | print(multiplied_matrix) 164 | ``` 165 | 166 | 9. **Combining Matrices** 167 | 168 | ```R 169 | combined_matrix <- cbind(mat1, mat2) 170 | print(combined_matrix) 171 | ``` 172 | 173 | 10.
**Matrix Determinant (Advanced)** 174 | 175 | ```R 176 | determinant <- det(det_matrix) 177 | print(determinant) 178 | ``` 179 | 180 | -------------------------------------------------------------------------------- /17 Day R Matrix/README.md: -------------------------------------------------------------------------------- 1 | # R Matrices in Detail 2 | 3 | **Introduction:** 4 | Matrices are a fundamental data structure in R, used to store data in a two-dimensional (rows and columns) format. Matrices can only store elements of the same type, such as numeric, character, or logical data. This tutorial covers creating, manipulating, and performing operations on matrices in R. 5 | 6 | ### 1. **Creating Matrices** 7 | 8 | Matrices in R are created using the `matrix()` function. You can specify the number of rows and columns and how elements are filled. 9 | 10 | ```r 11 | # Create a 3x3 numeric matrix 12 | matrix_data <- matrix(1:9, nrow = 3, ncol = 3) 13 | print(matrix_data) 14 | 15 | # Create a matrix by filling elements by row 16 | matrix_data_byrow <- matrix(1:9, nrow = 3, ncol = 3, byrow = TRUE) 17 | print(matrix_data_byrow) 18 | ``` 19 | 20 | ### 2. **Matrix Dimensions** 21 | 22 | #### `nrow()` and `ncol()`: Get Number of Rows and Columns 23 | The `nrow()` and `ncol()` functions return the number of rows and columns in a matrix. 24 | 25 | ```r 26 | nrow(matrix_data) # Output: 3 27 | ncol(matrix_data) # Output: 3 28 | ``` 29 | 30 | #### `dim()`: Get Dimensions 31 | The `dim()` function returns a vector with the number of rows and columns. 32 | 33 | ```r 34 | dim(matrix_data) # Output: 3 3 35 | ``` 36 | 37 | ### 3. **Accessing Matrix Elements** 38 | 39 | Matrix elements can be accessed using row and column indices within square brackets `[]`. Use `[row, column]` format. 
40 | 41 | ```r 42 | # Access element at row 2, column 3 43 | matrix_data[2, 3] # Output: 8 44 | 45 | # Access entire row or column 46 | matrix_data[1, ] # First row: Output: 1 4 7 47 | matrix_data[, 2] # Second column: Output: 4 5 6 48 | ``` 49 | 50 | ### 4. **Modifying Matrix Elements** 51 | 52 | You can modify elements in a matrix by specifying their positions. 53 | 54 | ```r 55 | # Replace the element at row 1, column 1 with 10 56 | matrix_data[1, 1] <- 10 57 | print(matrix_data) 58 | ``` 59 | 60 | ### 5. **Adding Rows and Columns** 61 | 62 | #### `rbind()`: Add Rows 63 | The `rbind()` function adds one or more rows to an existing matrix. 64 | 65 | ```r 66 | new_row <- c(10, 11, 12) 67 | matrix_data <- rbind(matrix_data, new_row) 68 | print(matrix_data) 69 | ``` 70 | 71 | #### `cbind()`: Add Columns 72 | The `cbind()` function adds one or more columns to an existing matrix. 73 | 74 | ```r 75 | new_column <- c(13, 14, 15, 16) 76 | matrix_data <- cbind(matrix_data, new_column) 77 | print(matrix_data) 78 | ``` 79 | 80 | ### 6. **Matrix Operations** 81 | 82 | #### Arithmetic Operations 83 | R allows element-wise arithmetic operations on matrices. 84 | 85 | ```r 86 | matrix1 <- matrix(1:4, nrow = 2) 87 | matrix2 <- matrix(5:8, nrow = 2) 88 | matrix1 + matrix2 # Addition 89 | matrix1 * matrix2 # Element-wise multiplication 90 | ``` 91 | 92 | #### Matrix Multiplication: `%*%` 93 | Use `%*%` for matrix multiplication following linear algebra rules. 94 | 95 | ```r 96 | matrix1 <- matrix(c(1, 2, 3, 4), nrow = 2) 97 | matrix2 <- matrix(c(5, 6, 7, 8), nrow = 2) 98 | matrix1 %*% matrix2 99 | ``` 100 | 101 | ### 7. **Transpose and Inverse of a Matrix** 102 | 103 | #### `t()`: Transpose 104 | The `t()` function transposes a matrix, swapping rows with columns. 105 | 106 | ```r 107 | transpose_matrix <- t(matrix_data) 108 | print(transpose_matrix) 109 | ``` 110 | 111 | #### `solve()`: Inverse of a Matrix 112 | The `solve()` function calculates the inverse of a square matrix.
113 | 114 | ```r 115 | square_matrix <- matrix(c(4, 7, 2, 6), nrow = 2) 116 | inverse_matrix <- solve(square_matrix) 117 | print(inverse_matrix) 118 | ``` 119 | 120 | ### 8. **Applying Functions to Rows and Columns** 121 | 122 | #### `apply()`: Apply Functions to Rows or Columns 123 | The `apply()` function applies a function to each row or column. 124 | 125 | ```r 126 | # Sum of each row 127 | apply(matrix_data, 1, sum) 128 | 129 | # Mean of each column 130 | apply(matrix_data, 2, mean) 131 | ``` 132 | 133 | ### 9. **Matrix Indexing and Filtering** 134 | 135 | You can use logical conditions to filter elements within a matrix. 136 | 137 | ```r 138 | # Create a matrix and filter elements greater than 5 139 | matrix_data <- matrix(1:9, nrow = 3) 140 | matrix_data[matrix_data > 5] # Output: 6 7 8 9 141 | ``` 142 | 143 | ### Summary Table of R Matrix Functions 144 | 145 | | Function | Description | Example | 146 | |---------------|---------------------------------------|--------------------------------------| 147 | | `matrix()` | Create a matrix | `matrix(1:9, nrow=3, ncol=3)` | 148 | | `nrow()` | Number of rows | `nrow(matrix_data)` | 149 | | `ncol()` | Number of columns | `ncol(matrix_data)` | 150 | | `dim()` | Dimensions of the matrix | `dim(matrix_data)` | 151 | | `t()` | Transpose a matrix | `t(matrix_data)` | 152 | | `solve()` | Inverse of a square matrix | `solve(square_matrix)` | 153 | | `apply()` | Apply function to rows or columns | `apply(matrix_data, 1, sum)` | 154 | | `rbind()` | Add rows to a matrix | `rbind(matrix_data, new_row)` | 155 | | `cbind()` | Add columns to a matrix | `cbind(matrix_data, new_column)` | 156 | | `%*%` | Matrix multiplication | `matrix1 %*% matrix2` | 157 | | `matrix[i,j]` | Access element at ith row, jth column | `matrix_data[2, 3]` | 158 | 159 | This guide offers a complete overview of matrix operations in R, including matrix creation, manipulation, and mathematical operations. 
Use these techniques to efficiently handle matrix data in R.

--------------------------------------------------------------------------------
/18 Day R List/List Methods.md:
--------------------------------------------------------------------------------
### **List of Key Methods for Working with Lists in R**
Welcome to **Codes With Pankaj**! Below is a detailed guide to the most commonly used methods for handling lists in R, with examples to help you understand each one.

---

### **1. `length()`**
The `length()` function returns the number of elements in a list.

#### Example:
```R
my_list <- list(name = "Pankaj", age = 25, scores = c(88, 90, 78))
length(my_list)
```
#### Output:
```
[1] 3
```

---

### **2. `names()`**
The `names()` function is used to get or set the names of list elements.

#### Example: Getting Names
```R
names(my_list)
```
#### Output:
```
[1] "name"   "age"    "scores"
```

#### Example: Setting Names
```R
names(my_list) <- c("Name", "Age", "ExamScores")
print(my_list)
```
#### Output:
```
$Name
[1] "Pankaj"

$Age
[1] 25

$ExamScores
[1] 88 90 78
```

---

### **3. `unlist()`**
The `unlist()` function converts a list into a vector by flattening its structure. When the elements have different types, they are coerced to a common type (here, character).

#### Example:
```R
flat_vector <- unlist(my_list)
print(flat_vector)
```
#### Output:
```
       Name         Age ExamScores1 ExamScores2 ExamScores3
   "Pankaj"        "25"        "88"        "90"        "78"
```

---

### **4. `lapply()`**
The `lapply()` function applies a function to each element of a list and returns the results as a list.

#### Example:
```R
score_list <- list(10, 20, 30)
doubled_scores <- lapply(score_list, function(x) x * 2)
print(doubled_scores)
```
#### Output:
```
[[1]]
[1] 20

[[2]]
[1] 40

[[3]]
[1] 60
```

---

### **5. `sapply()`**
The `sapply()` function is a simplified version of `lapply()`. It tries to return a vector or matrix instead of a list.

#### Example:
```R
score_list <- list(c(10, 20), c(30, 40))
sums <- sapply(score_list, sum)
print(sums)
```
#### Output:
```
[1] 30 70
```

---

### **6. `append()`**
The `append()` function adds elements to a list.

#### Example:
```R
my_list <- list(name = "Pankaj", age = 25)
my_list <- append(my_list, list(hobby = "Coding"))
print(my_list)
```
#### Output:
```
$name
[1] "Pankaj"

$age
[1] 25

$hobby
[1] "Coding"
```

---

### **7. `str()`**
The `str()` function provides the structure of a list, making it easier to inspect complex lists.

#### Example:
```R
str(my_list)
```
#### Output:
```
List of 3
 $ name : chr "Pankaj"
 $ age  : num 25
 $ hobby: chr "Coding"
```

---

### **8. `is.list()`**
The `is.list()` function checks if an object is a list and returns `TRUE` or `FALSE`.

#### Example:
```R
is_list <- is.list(my_list)
print(is_list)
```
#### Output:
```
[1] TRUE
```

---

### **9. `as.list()`**
The `as.list()` function converts other objects (e.g., vectors) into a list.

#### Example:
```R
vec <- c(1, 2, 3)
list_from_vec <- as.list(vec)
print(list_from_vec)
```
#### Output:
```
[[1]]
[1] 1

[[2]]
[1] 2

[[3]]
[1] 3
```

---

### **10. `rapply()`**
The `rapply()` function applies a function recursively to each element of a list.

#### Example:
```R
nested_list <- list(a = list(1, 2), b = list(3, 4))
result <- rapply(nested_list, function(x) x * 2, how = "list")
print(result)
```
#### Output:
```
$a
$a[[1]]
[1] 2

$a[[2]]
[1] 4

$b
$b[[1]]
[1] 6

$b[[2]]
[1] 8
```

---

### **11. `duplicated()`**
The `duplicated()` function checks for duplicate elements in a list.

#### Example:
```R
my_list <- list(1, 2, 3, 2, 1)
duplicates <- duplicated(my_list)
print(duplicates)
```
#### Output:
```
[1] FALSE FALSE FALSE  TRUE  TRUE
```

---

### **12. `rev()`**
The `rev()` function reverses the order of elements in a list.

#### Example:
```R
my_list <- list(1, 2, 3, 4)
reversed_list <- rev(my_list)
print(reversed_list)
```
#### Output:
```
[[1]]
[1] 4

[[2]]
[1] 3

[[3]]
[1] 2

[[4]]
[1] 1
```

---

### **13. `do.call()`**
The `do.call()` function executes a function using a list of arguments.

#### Example:
```R
args_list <- list(1, 2, 3)
sum_result <- do.call(sum, args_list)
print(sum_result)
```
#### Output:
```
[1] 6
```

---

### **14. `Filter()`**
The `Filter()` function applies a condition to filter elements from a list.

#### Example:
```R
my_list <- list(1, 2, 3, 4, 5)
filtered_list <- Filter(function(x) x > 3, my_list)
print(filtered_list)
```
#### Output:
```
[[1]]
[1] 4

[[2]]
[1] 5
```

---

### **15. `Map()`**
The `Map()` function applies a function to corresponding elements of multiple lists.

#### Example:
```R
list1 <- list(1, 2, 3)
list2 <- list(4, 5, 6)
result <- Map(function(x, y) x + y, list1, list2)
print(result)
```
#### Output:
```
[[1]]
[1] 5

[[2]]
[1] 7

[[3]]
[1] 9
```

---

### **Summary of Methods**

| **Function**   | **Description**                       |
|----------------|---------------------------------------|
| `length()`     | Returns the number of elements.       |
| `names()`      | Gets or sets the names.               |
| `unlist()`     | Flattens a list to a vector.          |
| `lapply()`     | Applies a function to elements.       |
| `sapply()`     | Simplified version of `lapply`.       |
| `append()`     | Adds elements to a list.              |
| `str()`        | Displays list structure.              |
| `is.list()`    | Checks if an object is a list.        |
| `as.list()`    | Converts an object to a list.         |
| `rapply()`     | Recursive version of `lapply`.        |
| `duplicated()` | Finds duplicate elements.             |
| `rev()`        | Reverses list elements.               |
| `do.call()`    | Executes a function on a list.        |
| `Filter()`     | Filters elements by condition.        |
| `Map()`        | Applies a function to multiple lists. |

---

With these methods and examples, you’re equipped to handle lists effectively in R. Try them out and take your skills to the next level with **Codes With Pankaj**!
🚀

--------------------------------------------------------------------------------
/18 Day R List/README.md:
--------------------------------------------------------------------------------
# **Tutorial: R List Basics with "Codes with Pankaj"**

Welcome to **Codes with Pankaj**, where we break down R programming concepts step by step! In this tutorial, we'll explore **R Lists**, one of the most versatile and powerful data structures in R.

---

## **What is an R List?**

An R **list** is a collection of elements that can be of different types (e.g., numbers, characters, vectors, data frames, etc.). Think of a list as a container that can hold various objects of different types and sizes.

### **Why Use Lists?**
- To group related data of different types.
- To store complex structures like models, functions, or other lists.

---

## **Step-by-Step Guide**

### **1. Creating a List**

A list is created using the `list()` function.

#### **Syntax:**
```R
list(element1, element2, ..., elementN)
```

#### **Example:**
```R
my_list <- list(
  name = "Pankaj",
  age = 28,
  hobbies = c("coding", "reading", "traveling")
)
print(my_list)
```

#### **Output:**
```
$name
[1] "Pankaj"

$age
[1] 28

$hobbies
[1] "coding"    "reading"   "traveling"
```

### **Explanation:**
- `name` is a **character** element.
- `age` is a **numeric** element.
- `hobbies` is a **vector** containing strings.

---

### **2. Accessing List Elements**

#### **Method 1: Using `$` Operator**
The `$` operator accesses elements by name.
```R
print(my_list$name)
# Output: "Pankaj"
```

#### **Method 2: Using Double Square Brackets `[[ ]]`**
Access elements by index or name.
```R
print(my_list[[2]])
# Output: 28
```

#### **Method 3: Using Single Square Brackets `[ ]`**
This returns a **sublist**.
```R
print(my_list[2])
# Output:
# $age
# [1] 28
```

### **Key Difference:**
- `[[ ]]` returns the actual element.
- `[ ]` returns a list containing the element.

---

### **3. Modifying a List**

You can update or add elements to a list.

#### **Example:**
```R
# Update an element
my_list$age <- 29
print(my_list$age) # Output: 29

# Add a new element
my_list$profession <- "Data Scientist"
print(my_list$profession) # Output: "Data Scientist"
```

---

### **4. Removing Elements**

To remove elements, set them to `NULL`.

#### **Example:**
```R
my_list$profession <- NULL
print(my_list)
# The "profession" element will be removed.
```

---

### **5. Applying Functions to Lists**

Use `lapply()` and `sapply()` to apply functions to list elements.

#### **lapply() Example:**
```R
numbers <- list(a = 1:5, b = 6:10)
result <- lapply(numbers, sum)
print(result)
# Output: List with sums of each vector
```

#### **sapply() Example:**
```R
result <- sapply(numbers, sum)
print(result)
# Output: A simplified vector of sums
```

---

### **6. Combining Lists**

Use the `c()` function to combine lists.

#### **Example:**
```R
list1 <- list(a = 1, b = 2)
list2 <- list(c = 3, d = 4)
combined <- c(list1, list2)
print(combined)
```

---

### **7. Nested Lists**

Lists can contain other lists!
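
Elements inside a nested list are reached by chaining the same access operators, one level at a time. The sketch below uses a made-up `team` list (the names and values are illustrative assumptions, not part of this tutorial's data) to show reading and updating an inner element.

```r
# Hypothetical nested list used only for illustration
team <- list(
  lead = list(name = "Asha", skills = c("R", "SQL")),
  member = list(name = "Ravi", skills = c("Python"))
)

# Chain `$` to walk down by name
lead_name <- team$lead$name

# Chain `[[ ]]` to walk down by name, then index into the inner vector
first_skill <- team[["lead"]][["skills"]][1]

# Update an inner element in place
team$member$skills <- c(team$member$skills, "R")

print(lead_name)
print(first_skill)
print(team$member$skills)
```

Each `$` or `[[ ]]` step returns the inner object itself, which is why the steps can be chained.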

#### **Example:**
```R
nested_list <- list(
  personal = list(name = "Pankaj", age = 28),
  professional = list(title = "Data Scientist", experience = 5)
)
print(nested_list)
```

---

### **8. Converting a List to a Vector**

If all list elements are of the same type, use `unlist()`.

#### **Example:**
```R
simple_list <- list(1, 2, 3, 4)
vector <- unlist(simple_list)
print(vector)
# Output: [1] 1 2 3 4
```

---

### **9. Checking List Properties**

#### **Length of a List:**
```R
print(length(my_list))
# Output: Number of elements in the list
```

#### **Type of Each Element:**
```R
types <- sapply(my_list, class)
print(types)
```

---

### **10. Advanced: Using `str()` for Structure**
The `str()` function gives a compact view of the list structure.

#### **Example:**
```R
str(my_list)
```

---

## **Summary**

| Task                       | Function/Method           |
|----------------------------|---------------------------|
| Create a list              | `list()`                  |
| Access elements            | `$`, `[[ ]]`, `[ ]`       |
| Modify/add elements        | Assign values             |
| Remove elements            | Assign `NULL`             |
| Apply functions            | `lapply()`, `sapply()`    |
| Combine lists              | `c()`                     |
| Nested lists               | Create lists inside lists |
| Convert to vector          | `unlist()`                |

---

## **Your Turn!**

Try creating your own lists with different types of data. Play around with accessing and modifying elements to see how flexible and powerful lists can be!

Stay tuned for more tutorials with **Codes with Pankaj**!
🚀

--------------------------------------------------------------------------------
/18 Day R List/list.r:
--------------------------------------------------------------------------------
# Creating a list with various data types
my_list <- list(name = "Alice", age = 30, scores = c(85, 92, 78), has_pet = TRUE)

# Accessing list elements by name
name <- my_list$name
age <- my_list$age

# Accessing list elements by position
first_score <- my_list[[3]][1]

# Modifying the list
my_list$city <- "New York"
my_list$age <- 31

# Removing an element from the list
my_list$city <- NULL

# Creating a nested list
nested_list <- list(person1 = list(name = "Bob", age = 25),
                    person2 = list(name = "Alice", age = 30))

# Accessing elements of the nested list
person1_name <- nested_list$person1$name
person2_age <- nested_list$person2$age

# Output
cat("Name:", name, "\n")
cat("Age:", age, "\n")
cat("First Score:", first_score, "\n")
cat("Modified List:\n")
print(my_list)
cat("Nested List:\n")
print(nested_list)
cat("Person1 Name:", person1_name, "\n")
cat("Person2 Age:", person2_age, "\n")

--------------------------------------------------------------------------------
/19 Day R Array/README.md:
--------------------------------------------------------------------------------
# **Tutorial: R Array Basics with "Codes with Pankaj"**

Welcome to **Codes with Pankaj**! Today, we'll dive into **R Arrays**, a versatile data structure for storing multi-dimensional data. Arrays in R allow you to work efficiently with structured data in two or more dimensions.

---

## **What is an R Array?**

An R **array** is a data structure that can store elements of the same type (numeric, character, etc.) in two or more dimensions.
Arrays extend the concept of matrices by supporting dimensions beyond 2D.

---

## **Step-by-Step Guide**

### **1. Creating an Array**

The `array()` function is used to create an array.

#### **Syntax:**
```R
array(data, dim, dimnames)
```

- **data**: The vector to be stored in the array.
- **dim**: A numeric vector specifying the dimensions of the array.
- **dimnames**: A list of names for the dimensions.

#### **Example:**
```R
my_array <- array(
  data = 1:12,
  dim = c(3, 4),
  dimnames = list(
    rows = c("Row1", "Row2", "Row3"),
    cols = c("Col1", "Col2", "Col3", "Col4")
  )
)
print(my_array)
```

#### **Output:**
```
     Col1 Col2 Col3 Col4
Row1    1    4    7   10
Row2    2    5    8   11
Row3    3    6    9   12
```

### **Explanation:**
- `dim = c(3, 4)` creates an array with 3 rows and 4 columns.
- `dimnames` assigns custom labels to rows and columns.

---

### **2. Multi-Dimensional Arrays**

Arrays can have more than two dimensions.

#### **Example:**
```R
multi_array <- array(
  data = 1:24,
  dim = c(3, 4, 2), # 3 rows, 4 columns, 2 layers
  dimnames = list(
    rows = c("Row1", "Row2", "Row3"),
    cols = c("Col1", "Col2", "Col3", "Col4"),
    layers = c("Layer1", "Layer2")
  )
)
print(multi_array)
```

#### **Output:**
```
, , Layer1
     Col1 Col2 Col3 Col4
Row1    1    4    7   10
Row2    2    5    8   11
Row3    3    6    9   12

, , Layer2
     Col1 Col2 Col3 Col4
Row1   13   16   19   22
Row2   14   17   20   23
Row3   15   18   21   24
```

---

### **3. Accessing Array Elements**

You can access elements using **indexing**. Specify indices for each dimension in square brackets.

#### **Example:**
```R
# Access a specific element
print(multi_array[2, 3, 1])
# Output: 8

# Access an entire row
print(multi_array[2, , 1])
# Output: Row 2 in Layer 1

# Access an entire column
print(multi_array[, 3, 2])
# Output: Column 3 in Layer 2
```

---

### **4. Modifying Array Elements**

You can update specific elements or slices of the array.

#### **Example:**
```R
multi_array[2, 3, 1] <- 99
print(multi_array[2, 3, 1]) # Output: 99
```

---

### **5. Performing Operations on Arrays**

R supports element-wise operations on arrays.

#### **Example:**
```R
# Add a scalar to all elements
new_array <- multi_array + 10
print(new_array)

# Multiply by a scalar
scaled_array <- multi_array * 2
print(scaled_array)
```

---

### **6. Applying Functions Across Dimensions**

Use the `apply()` function to apply a function along specific dimensions.

#### **Syntax:**
```R
apply(array, MARGIN, FUN)
```

- **MARGIN**: Dimension to apply the function (1 = rows, 2 = columns, etc.).
- **FUN**: The function to apply.

#### **Example:**
```R
# Calculate the sum of each row across layers
row_sums <- apply(multi_array, MARGIN = 1, FUN = sum)
print(row_sums)

# Calculate the mean of each column
col_means <- apply(multi_array, MARGIN = 2, FUN = mean)
print(col_means)
```

---

### **7. Combining Arrays**

You can combine arrays along new dimensions using `abind()` from the **abind** package.
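
For two matrices of identical shape, the same kind of stacking can also be sketched in base R without an extra package. The snippet below is a minimal illustration under that assumption; `layer1`, `layer2`, and `stacked` are names chosen here for the example.

```r
# Two 3 x 4 matrices to stack (illustrative data)
layer1 <- matrix(1:12, nrow = 3, ncol = 4)
layer2 <- matrix(13:24, nrow = 3, ncol = 4)

# c() flattens both matrices column by column; array() reshapes
# the combined vector into 3 rows, 4 columns, 2 layers
stacked <- array(c(layer1, layer2), dim = c(3, 4, 2))

print(dim(stacked))
print(identical(stacked[, , 1], layer1))
print(identical(stacked[, , 2], layer2))
```

This works because R stores arrays in column-major order, so the first 12 values fill the first layer and the next 12 fill the second.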

#### **Example:**
```R
install.packages("abind")
library(abind)

array1 <- array(1:12, dim = c(3, 4))
array2 <- array(13:24, dim = c(3, 4))

combined <- abind(array1, array2, along = 3)
print(combined)
```

---

### **8. Checking Array Properties**

#### **Example:**
```R
# Dimensions of the array
print(dim(multi_array))

# Number of elements
print(length(multi_array))

# Check structure
str(multi_array)
```

---

### **9. Array vs Matrix**

- **Matrix**: Always 2D.
- **Array**: Can have 2D or more dimensions.

#### **Example of Conversion:**
```R
# Convert matrix to array
mat <- matrix(1:12, nrow = 3, ncol = 4)
converted_array <- array(mat, dim = c(3, 4, 1))
print(converted_array)
```

---

## **Summary**

| Task                      | Function/Method              |
|---------------------------|------------------------------|
| Create an array           | `array()`                    |
| Access elements           | Indexing `[ , , ]`           |
| Modify elements           | Assignment                   |
| Perform operations        | Arithmetic, `apply()`        |
| Combine arrays            | `abind()`                    |
| Check properties          | `dim()`, `length()`, `str()` |

---

## **Your Turn!**

Experiment with arrays by creating your own multi-dimensional structures. Use `apply()` to perform calculations across dimensions. Arrays are a key tool for data organization and manipulation in R.

Stay tuned for more tutorials with **Codes with Pankaj**!
🚀

--------------------------------------------------------------------------------
/19 Day R Array/array.r:
--------------------------------------------------------------------------------
# Creating a 3-dimensional array
# Each dimnames entry must match its dimension's extent, so the
# second dimension (4 columns) needs four names.
data_array <- array(1:24, dim = c(3, 4, 2),
                    dimnames = list(c("A", "B", "C"), c("X", "Y", "Z", "W"), c("M", "N")))

# Accessing elements of the array
element1 <- data_array[1, 2, 1] # Accessing element at (1, 2, 1)
element2 <- data_array["A", "Y", "M"] # Accessing element by dimension names

# Displaying the original array
cat("Original Array:\n")
print(data_array)

# Transposing the array (changing the order of dimensions)
transposed_array <- aperm(data_array, c(3, 2, 1))

# Displaying the transposed array
cat("\nTransposed Array:\n")
print(transposed_array)

# Accessing elements of the transposed array
transposed_element1 <- transposed_array[1, 2, 1]
transposed_element2 <- transposed_array["M", "Y", "A"]

# Output
cat("\nElement 1 in Original Array:", element1, "\n")
cat("Element 2 in Original Array:", element2, "\n")
cat("\nElement 1 in Transposed Array:", transposed_element1, "\n")
cat("Element 2 in Transposed Array:", transposed_element2, "\n")

--------------------------------------------------------------------------------
/20 Day R Data Frame /Advanced-level practice questions.md:
--------------------------------------------------------------------------------
# Advanced-level practice questions

---

### 1. **Data Transformation and Filtering**
You are given a data frame `df` with columns: `ID`, `Name`, `Age`, `Salary`, and `Department`. Perform the following tasks:
- Filter rows where `Age` is greater than 30 and `Salary` is above the median salary.
- Add a new column `Bonus` which is 10% of the `Salary`.
- Arrange the resulting data frame in descending order of `Bonus`.
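
One possible way to approach the three tasks above in base R is sketched here. The exercise supplies no data, so the `df` below is a hypothetical sample made up for illustration, and the median is assumed to be taken over the full data before filtering.

```r
# Hypothetical sample data; the exercise does not supply any
df <- data.frame(
  ID = 1:6,
  Name = c("Asha", "Ben", "Chen", "Dina", "Eli", "Fay"),
  Age = c(28, 35, 42, 31, 45, 38),
  Salary = c(40000, 52000, 90000, 48000, 75000, 61000),
  Department = c("IT", "HR", "IT", "Finance", "HR", "IT")
)

# Median is computed on the full data, before filtering
median_salary <- median(df$Salary)

# Keep rows with Age > 30 and Salary above the median
result <- df[df$Age > 30 & df$Salary > median_salary, ]

# Bonus is 10% of Salary
result$Bonus <- result$Salary * 0.10

# Sort by Bonus, largest first
result <- result[order(result$Bonus, decreasing = TRUE), ]
print(result)
```

The same steps could be written with `dplyr` (`filter()`, `mutate()`, `arrange()`); base R is used here only to avoid assuming an installed package.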

---

### 2. **Merging and Joining Data Frames**
Given two data frames:
```R
df1 <- data.frame(ID = c(1, 2, 3), Name = c("Alice", "Bob", "Charlie"), Age = c(25, 30, 35))
df2 <- data.frame(ID = c(2, 3, 4), Department = c("HR", "IT", "Finance"), Salary = c(70000, 80000, 90000))
```
- Perform an **inner join**, **left join**, and **full join** on `ID`.
- Add a column `Age_Group` to the joined data frame using the following conditions:
  - `< 30`: "Young"
  - `30-40`: "Mid"
  - `> 40`: "Senior"

---

### 3. **Group Summaries and Aggregation**
Using a data frame `employees` with columns `EmployeeID`, `Department`, `Salary`, and `Experience` (in years):
- Calculate the average `Salary` and `Experience` per `Department`.
- Identify the department with the highest average salary.
- Add a column `Experience_Level` categorizing employees as `"Junior"` (<5 years), `"Mid-Level"` (5-10 years), or `"Senior"` (>10 years).

---

### 4. **Data Manipulation with apply Functions**
You have a data frame `grades` where each row corresponds to a student, and columns `Math`, `Science`, `English` contain their scores:
- Use `apply()` to calculate the average score for each student.
- Identify students who scored below 40 in any subject and return their names.
- Create a new column `Status` where students are marked as `"Pass"` if their average score is ≥50; otherwise `"Fail"`.

---

### 5. **Handling Missing Data and Imputation**
You are given a data frame `sales` with columns `Region`, `Sales`, and `Profit`. Some values in `Sales` and `Profit` are missing (`NA`).
- Identify the rows with missing data.
- Replace missing values in `Sales` with the median of the column and in `Profit` with the mean of the column.
- Create a summary showing total `Sales` and `Profit` per `Region` after imputation.
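
The imputation steps above can be sketched in base R as follows. The `sales` data here is a hypothetical sample invented for illustration, and the median and mean are assumed to be computed from the non-missing values only.

```r
# Hypothetical sample data with gaps
sales <- data.frame(
  Region = c("North", "South", "North", "East", "South", "East"),
  Sales = c(100, NA, 300, 200, 400, NA),
  Profit = c(10, 20, NA, 40, NA, 30)
)

# Rows with at least one missing value
rows_with_na <- sales[!complete.cases(sales), ]
print(rows_with_na)

# Impute: median for Sales, mean for Profit (NAs ignored when computing)
sales$Sales[is.na(sales$Sales)] <- median(sales$Sales, na.rm = TRUE)
sales$Profit[is.na(sales$Profit)] <- mean(sales$Profit, na.rm = TRUE)

# Totals per region after imputation
totals <- aggregate(cbind(Sales, Profit) ~ Region, data = sales, FUN = sum)
print(totals)
```

Imputing before aggregating matters here: the formula interface of `aggregate()` drops rows containing `NA` by default, which would silently shrink the totals.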

---

--------------------------------------------------------------------------------
/20 Day R Data Frame /Practice_Questions.r:
--------------------------------------------------------------------------------
# Task 1: Create a data frame with 10 employees' details
employees <- data.frame(
  Name = c("John", "Emma", "Liam", "Olivia", "Sophia", "Noah", "Ava", "James", "Isabella", "Lucas"),
  Salary = c(45000, 60000, 52000, 48000, 75000, 47000, 82000, 58000, 68000, 73000),
  Department = c("IT", "HR", "Finance", "IT", "Marketing", "Finance", "HR", "Marketing", "IT", "Finance")
)

# Display employees with a salary greater than 50,000
high_salary_employees <- employees[employees$Salary > 50000, ]
print("Employees with salary greater than 50,000:")
print(high_salary_employees)

# Task 2: Add a new column for performance ratings
employees$Performance_Rating <- c(3, 4, 5, 2, 5, 3, 4, 5, 4, 5)
print("Updated data frame with performance ratings:")
print(employees)

# Task 3: Merge two data frames based on product IDs
# Creating two data frames for products and sales
products <- data.frame(
  Product_ID = c(101, 102, 103, 104, 105),
  Product_Name = c("Laptop", "Mouse", "Keyboard", "Monitor", "Printer")
)

sales <- data.frame(
  Product_ID = c(103, 101, 105, 102, 104),
  Units_Sold = c(15, 10, 5, 20, 8)
)

merged_data <- merge(products, sales, by = "Product_ID")
print("Merged data frame:")
print(merged_data)

# Task 4: Sort a data frame containing weather data by temperature in ascending order
weather <- data.frame(
  City = c("New York", "Los Angeles", "Chicago", "Houston", "Phoenix"),
  Temperature = c(55, 75, 48, 68, 85)
)

sorted_weather <- weather[order(weather$Temperature), ]
print("Weather data sorted by temperature:")
print(sorted_weather)
--------------------------------------------------------------------------------
/20 Day R Data Frame /README.md:
--------------------------------------------------------------------------------
# Data frames
### Data frames are a fundamental part of data manipulation and analysis in R. This tutorial will guide you through the basics of creating, inspecting, and manipulating data frames in R.

### What is a Data Frame?

A data frame is a table-like structure in R, where each column can contain different types of data (numeric, character, factor, etc.). It is similar to a spreadsheet or SQL table.

### Creating a Data Frame

#### Method 1: Using `data.frame()`

```r
# Create vectors
names <- c("Alice", "Bob", "Charlie")
ages <- c(25, 30, 35)
genders <- c("Female", "Male", "Male")

# Create a data frame
df <- data.frame(Name = names, Age = ages, Gender = genders)

# Print the data frame
print(df)
```

#### Method 2: Using `tibble` from the `tibble` package

```r
# Install tibble package if not already installed
install.packages("tibble")

# Load the tibble package
library(tibble)

# Create a tibble
df_tibble <- tibble(Name = names, Age = ages, Gender = genders)

# Print the tibble
print(df_tibble)
```

### Inspecting a Data Frame

#### View the Structure

```r
# View the structure of the data frame
str(df)
```

#### Summary Statistics

```r
# Get summary statistics
summary(df)
```

#### View the First and Last Few Rows

```r
# View the first few rows
head(df)

# View the last few rows
tail(df)
```

### Accessing Data Frame Elements

#### By Column Name

```r
# Access a single column
df$Name

# Access multiple columns
df[, c("Name", "Age")]
```

#### By Row and Column Indices

```r
# Access a single element
df[1, 2]

# Access a single row
df[1, ]

# Access a single column
df[, 2]
```

### Adding and Removing Columns

#### Adding a New Column

```r
# Add a new column
df$Height <- c(160, 175, 180)

# Print the updated data frame
print(df)
```

#### Removing a Column

```r
# Remove a column
df$Height <- NULL

# Print the updated data frame
print(df)
```

### Filtering Data

#### Using Logical Conditions

```r
# Filter rows where Age is greater than 28
df_filtered <- df[df$Age > 28, ]

# Print the filtered data frame
print(df_filtered)
```

### Sorting Data

```r
# Sort the data frame by Age
df_sorted <- df[order(df$Age), ]

# Print the sorted data frame
print(df_sorted)
```

### Merging Data Frames

#### Using `merge()`

```r
# Create another data frame
df2 <- data.frame(Name = c("Alice", "Bob", "David"), Salary = c(50000, 55000, 60000))

# Merge data frames
df_merged <- merge(df, df2, by = "Name", all = TRUE)

# Print the merged data frame
print(df_merged)
```

### Handling Missing Values

#### Identify Missing Values

```r
# Check for missing values
is.na(df)
```

#### Remove Rows with Missing Values

```r
# Remove rows with any missing values
df_no_na <- na.omit(df)

# Print the data frame without missing values
print(df_no_na)
```

#### Replace Missing Values

```r
# Replace NA with a specific value
df[is.na(df)] <- 0

# Print the data frame with replaced values
print(df)
```

### Conclusion

In this tutorial, we covered the basics of creating, inspecting, and manipulating data frames in R. This includes creating data frames using `data.frame()` and `tibble`, accessing and modifying data frame elements, filtering and sorting data, merging data frames, and handling missing values. Data frames are a powerful tool in R, allowing for flexible and efficient data manipulation, making them essential for data analysis and statistical modeling.

--------------------------------------------------------------------------------
/20 Day R Data Frame /dataframe.r:
--------------------------------------------------------------------------------
# Creating a data frame
student_data <- data.frame(
  Name = c("Alice", "Bob", "Charlie", "David"),
  Age = c(25, 22, 24, 23),
  Grade = c("A", "B", "A", "C"),
  Passed = c(TRUE, TRUE, TRUE, FALSE)
)

# Accessing elements of the data frame
first_name <- student_data$Name[1] # Accessing the first student's name
third_age <- student_data$Age[3] # Accessing the age of the third student

# Displaying the original data frame
cat("Original Data Frame:\n")
print(student_data)

# Modifying the data frame
student_data$Grade[4] <- "B" # Modifying the grade of the fourth student

# Adding a new column
student_data$City <- c("New York", "Los Angeles", "Chicago", "Houston")

# Displaying the modified data frame
cat("\nModified Data Frame:\n")
print(student_data)

# Data frame functions
num_rows <- nrow(student_data)
num_cols <- ncol(student_data)
column_names <- names(student_data)
data_summary <- summary(student_data)

# Output
cat("\nNumber of Rows:", num_rows, "\n")
cat("Number of Columns:", num_cols, "\n")
cat("Column Names:", column_names, "\n")
cat("\nData Summary:\n")
print(data_summary) 39 | -------------------------------------------------------------------------------- /21 Day R Factors /README.md: -------------------------------------------------------------------------------- 1 | # **Tutorial: R Factors Explained with "Codes with Pankaj"** 2 | 3 | Welcome to **Codes with Pankaj**! In this tutorial, we’ll explore **Factors in R**, an essential data type for handling categorical data. Factors are particularly useful in data analysis and modeling. 4 | 5 | --- 6 | 7 | ## **What is an R Factor?** 8 | 9 | A **factor** in R is a data type used to handle **categorical data**. Categorical data represents values that belong to a limited number of categories, such as gender, colors, or ratings. 10 | 11 | ### **Why Use Factors?** 12 | - Factors help save memory by storing categories efficiently. 13 | - They are critical for statistical modeling in R (e.g., regression, ANOVA). 14 | - They allow easy labeling and ordering of categorical data. 15 | 16 | --- 17 | 18 | ## **Step-by-Step Guide** 19 | 20 | ### **1. Creating a Factor** 21 | 22 | You can create a factor using the `factor()` function. 23 | 24 | #### **Syntax:** 25 | ```R 26 | factor(x, levels, labels, ordered) 27 | ``` 28 | 29 | - **x**: Vector of data. 30 | - **levels**: Unique categories (optional). 31 | - **labels**: Custom labels for the categories (optional). 32 | - **ordered**: Logical value indicating if the factor is ordered (default is `FALSE`). 33 | 34 | #### **Example:** 35 | ```R 36 | # Creating a simple factor 37 | colors <- factor(c("Red", "Blue", "Green", "Red", "Blue")) 38 | print(colors) 39 | ``` 40 | 41 | #### **Output:** 42 | ``` 43 | [1] Red Blue Green Red Blue 44 | Levels: Blue Green Red 45 | ``` 46 | 47 | --- 48 | 49 | ### **2. Levels in a Factor** 50 | 51 | Levels represent the unique categories in a factor. 

#### **Example:**
```R
print(levels(colors))
# Output: [1] "Blue"  "Green" "Red"
```

#### **Modifying Levels:**
Assigning to `levels()` renames the levels by position. It does not reorder them; to change the order, rebuild the factor with `factor(x, levels = ...)`.
```R
levels(colors) <- c("Azure", "Emerald", "Crimson")
print(colors)
# Output: [1] Crimson Azure   Emerald Crimson Azure
# Levels: Azure Emerald Crimson
```

---

### **3. Checking the Structure of a Factor**

Use the `str()` function to examine the structure of a factor.

#### **Example:**
```R
str(colors)
# Output:
# Factor w/ 3 levels "Azure","Emerald",..: 3 1 2 3 1
```

---

### **4. Ordered Factors**

By default, factors are unordered. Use `ordered = TRUE` to create an **ordered factor**.

#### **Example:**
```R
ratings <- factor(
  c("Low", "Medium", "High", "Low"),
  levels = c("Low", "Medium", "High"),
  ordered = TRUE
)
print(ratings)
# Output: [1] Low    Medium High   Low
# Levels: Low < Medium < High
```

#### **Use Case:**
Ordered factors allow comparisons:
```R
print(ratings[1] < ratings[2])  # Output: TRUE
```

---

### **5. Converting a Factor**

#### Convert Factor to Character:
```R
char_colors <- as.character(colors)
print(char_colors)
# Output: [1] "Crimson" "Azure"   "Emerald" "Crimson" "Azure"
```

#### Convert Factor to Numeric:
When converting to numeric, use `as.numeric(as.character())` so that you get the values the levels spell out rather than the internal level codes. Applied to a factor such as `ratings`, whose levels are words, it returns `NA` with a warning.
```R
num_factor <- factor(c("5", "10", "15"))
num_values <- as.numeric(as.character(num_factor))
print(num_values)  # Output: [1]  5 10 15
# This works only when the levels are numbers stored as text.
```

---

### **6. Summary of a Factor**

The `summary()` function gives a count of each level in the factor.

#### **Example:**
```R
summary(colors)
# Output:
#   Azure Emerald Crimson
#       2       1       2
```

---

### **7. Combining Factors**

To combine two factors, convert each to character first and build a new factor; its levels are the union of the levels of both inputs.

#### **Example:**
```R
factor1 <- factor(c("A", "B"))
factor2 <- factor(c("B", "C"))
combined <- factor(c(as.character(factor1), as.character(factor2)))
print(combined)
# Output: [1] A B B C
# Levels: A B C
```

---

### **8. Factors in Data Frames**

Factors are often used in data frames to represent categorical columns.

#### **Example:**
```R
data <- data.frame(
  Name = c("Alice", "Bob", "Charlie"),
  Gender = factor(c("Female", "Male", "Male"))
)
print(data)
# Output:
#      Name Gender
# 1   Alice Female
# 2     Bob   Male
# 3 Charlie   Male
```

#### **Check Levels in a Column:**
```R
print(levels(data$Gender))
# Output: [1] "Female" "Male"
```

---

### **9. Dropping Unused Levels**

After subsetting a factor, unused levels might remain. Use `droplevels()` to remove them.

#### **Example:**
```R
subset <- colors[1:2]
print(subset)
# Output: [1] Crimson Azure
# Levels: Azure Emerald Crimson

clean_subset <- droplevels(subset)
print(clean_subset)
# Output: [1] Crimson Azure
# Levels: Azure Crimson
```

---

### **10. Factor Pitfalls and Tips**

- **Pitfall 1**: Automatic conversion to factors in data frames.
  ```R
  df <- data.frame(name = c("Pankaj", "Ravi"))
  str(df)  # In R versions before 4.0.0, `name` becomes a factor by default.
  ```
  **Solution**: Use `stringsAsFactors = FALSE` when creating a data frame (this is the default since R 4.0.0).

- **Pitfall 2**: Converting factors to numeric directly.
  ```R
  num_factor <- factor(c(5, 10, 15))
  print(as.numeric(num_factor))                # Incorrect: returns the level codes 1 2 3
  print(as.numeric(as.character(num_factor)))  # Correct: returns 5 10 15
  ```

---

## **Summary**

| Task                        | Function/Method              |
|-----------------------------|------------------------------|
| Create a factor             | `factor()`                   |
| Check levels                | `levels()`                   |
| Rename levels               | Assign to `levels()`         |
| Ordered factors             | `ordered = TRUE`             |
| Convert factor to character | `as.character()`             |
| Convert factor to numeric   | `as.numeric(as.character())` |
| Drop unused levels          | `droplevels()`               |

---

## **Your Turn!**

Experiment with factors by creating your own categorical data. Use ordered factors to analyze rankings or ratings, and combine them with data frames for deeper insights.

Stay tuned for more tutorials with **Codes with Pankaj**! 🚀
--------------------------------------------------------------------------------
/21 Day R Factors /factors.r:
--------------------------------------------------------------------------------
# Creating a factor for car colors
car_colors <- c("Red", "Blue", "Green", "Red", "Blue")
factor_colors <- factor(car_colors)

# Viewing levels of the factor
color_levels <- levels(factor_colors)

# Displaying the levels
cat("Levels of the factor_colors:\n")
print(color_levels)

# Creating an ordered factor for education levels
education_levels <- c("High School", "Bachelor's", "Master's", "Ph.D.")
ordered_education <- factor(education_levels,
                            levels = c("High School", "Bachelor's", "Master's", "Ph.D."),
                            ordered = TRUE)

# Displaying the ordered factor
cat("\nOrdered Education Levels:\n")
print(ordered_education)

# Changing levels of a factor (renames Blue, Green, Red by position)
new_levels <- c("Low", "Medium", "High")
levels(factor_colors) <- new_levels

# Displaying the factor with new levels
cat("\nFactor with New Levels:\n")
print(factor_colors)

# Creating a factor for survey responses
responses <- c("Agree", "Disagree", "Neutral", "Agree", "Strongly Disagree")
factor_responses <- factor(responses)

# Displaying a table of frequencies
cat("\nFrequency Table for Factor Responses:\n")
print(table(factor_responses))

# Summary statistics for the factor
cat("\nSummary of Factor Responses:\n")
print(summary(factor_responses))
--------------------------------------------------------------------------------
/22 Day R Data Visualization /R Bar Plot/Example.md:
--------------------------------------------------------------------------------
# **Reading Excel Files and Creating Bar Plots in R**

## **Prerequisites**
Before starting, ensure the required R packages are installed. Run the following code to install `readxl` and `ggplot2` if not already installed:

```R
install.packages(c("readxl", "ggplot2"))
```

---

## **Step 1: Load the Necessary Libraries**
We need the following libraries:
- **`readxl`**: For reading Excel files.
- **`ggplot2`**: For creating sophisticated plots.

```R
library(readxl)  # To handle Excel files
library(ggplot2) # For advanced plotting
```

---

## **Step 2: Reading the Excel File**
### **Method 1: Basic Reading**
Assume you have a file named `sales_data.xlsx` with columns **Product**, **Sales**, and **Region**.
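
If no such file is available yet, one option for following along is a small in-memory stand-in with the same three columns. The sketch below copies the four rows from the "Example Excel File Structure" table at the end of this tutorial; the name `sales_data_demo` is hypothetical and not part of the original workflow.

```R
# Stand-in for the Excel file: same column names, rows copied from the
# "Example Excel File Structure" table (sales_data_demo is a hypothetical name)
sales_data_demo <- data.frame(
  Product = c("Laptop", "Phone", "Tablet", "Laptop"),
  Sales   = c(5000, 3000, 4000, 6000),
  Region  = c("North", "South", "East", "West")
)

# Confirm the columns and their types
str(sales_data_demo)
```

Because the later steps refer only to the `Product`, `Sales`, and `Region` columns, a data frame like this should be usable wherever `sales_data` appears.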
27 | 28 | ```R 29 | sales_data <- read_excel("sales_data.xlsx") 30 | ``` 31 | 32 | ### **Method 2: Reading a Specific Sheet** 33 | To load a specific sheet from the Excel file: 34 | 35 | ```R 36 | sales_data <- read_excel("sales_data.xlsx", sheet = "Sheet1") 37 | ``` 38 | 39 | ### **Optional: Specify Column Types** 40 | If column types are ambiguous, specify them explicitly: 41 | 42 | ```R 43 | sales_data <- read_excel("sales_data.xlsx", col_types = c("text", "numeric", "text")) 44 | ``` 45 | 46 | --- 47 | 48 | ## **Step 3: Verify the Data** 49 | Before plotting, always inspect the data to ensure it is loaded correctly. 50 | 51 | ### **View First Few Rows** 52 | ```R 53 | print(head(sales_data)) 54 | ``` 55 | 56 | ### **Check the Structure** 57 | ```R 58 | str(sales_data) 59 | ``` 60 | 61 | ### **Summary Statistics** 62 | ```R 63 | summary(sales_data) 64 | ``` 65 | 66 | --- 67 | 68 | ## **Step 4: Data Cleaning and Preparation** 69 | ### **Check for Missing Values** 70 | ```R 71 | any(is.na(sales_data)) # Returns TRUE if missing values exist 72 | ``` 73 | 74 | ### **Remove Missing Values** 75 | ```R 76 | sales_data <- na.omit(sales_data) 77 | ``` 78 | 79 | ### **Aggregate Sales Data (if needed)** 80 | Aggregate total sales by product or region: 81 | 82 | ```R 83 | agg_sales <- aggregate(Sales ~ Product + Region, data = sales_data, sum) 84 | ``` 85 | 86 | --- 87 | 88 | ## **Step 5: Create a Basic Bar Plot** 89 | Visualize the total sales by product: 90 | 91 | ```R 92 | basic_bar_plot <- ggplot(sales_data, aes(x = Product, y = Sales)) + 93 | geom_bar(stat = "identity", fill = "blue") + 94 | labs(title = "Sales by Product", 95 | x = "Product", 96 | y = "Total Sales") + 97 | theme_minimal() 98 | 99 | print(basic_bar_plot) 100 | ``` 101 | 102 | --- 103 | 104 | ## **Step 6: Create Advanced Bar Plots** 105 | ### **Grouped Bar Plot by Region** 106 | ```R 107 | grouped_bar_plot <- ggplot(sales_data, aes(x = Product, y = Sales, fill = Region)) + 108 | geom_bar(stat = 
"identity", position = "dodge") + 109 | labs(title = "Sales by Product and Region", 110 | x = "Product", 111 | y = "Total Sales", 112 | fill = "Region") + 113 | theme_minimal() + 114 | scale_fill_brewer(palette = "Set1") 115 | 116 | print(grouped_bar_plot) 117 | ``` 118 | 119 | ### **Stacked Bar Plot by Region** 120 | ```R 121 | stacked_bar_plot <- ggplot(sales_data, aes(x = Product, y = Sales, fill = Region)) + 122 | geom_bar(stat = "identity", position = "stack") + 123 | labs(title = "Stacked Sales by Product and Region", 124 | x = "Product", 125 | y = "Total Sales", 126 | fill = "Region") + 127 | theme_minimal() + 128 | scale_fill_brewer(palette = "Set2") 129 | 130 | print(stacked_bar_plot) 131 | ``` 132 | 133 | --- 134 | 135 | ## **Step 7: Save the Plots** 136 | ### **Save Plots as PNG** 137 | Save the plots to your working directory: 138 | 139 | ```R 140 | ggsave("basic_sales_barplot.png", basic_bar_plot, width = 10, height = 6) 141 | ggsave("grouped_sales_barplot.png", grouped_bar_plot, width = 10, height = 6) 142 | ggsave("stacked_sales_barplot.png", stacked_bar_plot, width = 10, height = 6) 143 | ``` 144 | 145 | --- 146 | 147 | ## **Step 8: Adding More Customizations** 148 | ### **Rotate X-Axis Labels** 149 | ```R 150 | grouped_bar_plot <- grouped_bar_plot + 151 | theme(axis.text.x = element_text(angle = 45, hjust = 1)) 152 | print(grouped_bar_plot) 153 | ``` 154 | 155 | ### **Add Annotations** 156 | Annotate bars with values: 157 | 158 | ```R 159 | annotated_plot <- grouped_bar_plot + 160 | geom_text(aes(label = Sales), position = position_dodge(width = 0.9), vjust = -0.5) 161 | print(annotated_plot) 162 | ``` 163 | 164 | ### **Customize Themes** 165 | Use predefined themes like `theme_classic` or create custom themes: 166 | 167 | ```R 168 | custom_plot <- grouped_bar_plot + theme_classic() 169 | print(custom_plot) 170 | ``` 171 | 172 | --- 173 | 174 | ## **Troubleshooting Tips** 175 | 1. 
**Check the Working Directory**: Use `getwd()` to verify your current directory.
2. **Ensure File Exists**: Use `file.exists("sales_data.xlsx")` to confirm the file is present.
3. **Column Name Matching**: Ensure column names in the Excel file match exactly, including capitalization.
4. **Install Missing Dependencies**: Use `install.packages()` to install any missing libraries.

---

## **Example Excel File Structure**
| Product | Sales | Region |
|---------|-------|--------|
| Laptop  | 5000  | North  |
| Phone   | 3000  | South  |
| Tablet  | 4000  | East   |
| Laptop  | 6000  | West   |

This tutorial covered reading Excel data, cleaning and aggregating it, and building, customizing, and saving basic, grouped, and stacked bar plots.
--------------------------------------------------------------------------------
/22 Day R Data Visualization /R Bar Plot/R bar Plot.md:
--------------------------------------------------------------------------------
Creating bar plots in R is a fundamental part of data visualization. Bar plots can be created using various functions and packages in R, such as `barplot()` from base R and `ggplot()` from the `ggplot2` package. This tutorial will guide you through the steps to create and customize bar plots using both methods.

### Using `barplot()` in Base R

#### Step 1: Basic Bar Plot

First, we will create a simple bar plot using the `barplot()` function from base R.

```r
# Create a vector of data
counts <- c(23, 17, 35, 29)

# Create a bar plot
barplot(counts)
```
![image](https://github.com/Pankaj-Str/R-Programming-Tutorial/assets/36913690/0d121ef6-3898-4270-9ab1-587b67f50c54)

#### Step 2: Adding Labels and Titles

You can add labels and titles to make the plot more informative.
21 | 22 | ```r 23 | # Create a bar plot with labels and a title 24 | barplot(counts, 25 | main = "Basic Bar Plot", 26 | xlab = "Categories", 27 | ylab = "Counts", 28 | names.arg = c("A", "B", "C", "D")) 29 | ``` 30 | ![image](https://github.com/Pankaj-Str/R-Programming-Tutorial/assets/36913690/2d4d4069-b74c-456e-ac71-7b3ec4061641) 31 | 32 | #### Step 3: Customizing Colors 33 | 34 | Colors can be customized using the `col` parameter. 35 | 36 | ```r 37 | # Create a bar plot with custom colors 38 | barplot(counts, 39 | main = "Bar Plot with Colors", 40 | xlab = "Categories", 41 | ylab = "Counts", 42 | names.arg = c("A", "B", "C", "D"), 43 | col = c("red", "blue", "green", "purple")) 44 | ``` 45 | ![image](https://github.com/Pankaj-Str/R-Programming-Tutorial/assets/36913690/e16cb065-ed4f-4a84-980e-c86b3d1ce12d) 46 | 47 | #### Step 4: Adding Grid Lines 48 | 49 | Grid lines can be added to improve readability. 50 | 51 | ```r 52 | # Create a bar plot with grid lines 53 | barplot(counts, 54 | main = "Bar Plot with Grid Lines", 55 | xlab = "Categories", 56 | ylab = "Counts", 57 | names.arg = c("A", "B", "C", "D"), 58 | col = c("red", "blue", "green", "purple")) 59 | 60 | # Add grid lines 61 | grid(nx = NA, ny = NULL) 62 | ``` 63 | ![image](https://github.com/Pankaj-Str/R-Programming-Tutorial/assets/36913690/844fc48d-3af3-4ec1-bc27-f17edd93ef64) 64 | 65 | #### Step 5: Horizontal Bar Plot 66 | 67 | Bar plots can also be horizontal by setting the `horiz` parameter to `TRUE`. 
68 | 69 | ```r 70 | # Create a horizontal bar plot 71 | barplot(counts, 72 | main = "Horizontal Bar Plot", 73 | xlab = "Counts", 74 | ylab = "Categories", 75 | names.arg = c("A", "B", "C", "D"), 76 | col = c("red", "blue", "green", "purple"), 77 | horiz = TRUE) 78 | ``` 79 | ![image](https://github.com/Pankaj-Str/R-Programming-Tutorial/assets/36913690/b2547fc4-5b65-4206-9c1b-d9f7b5c8ac57) 80 | 81 | ### Using `ggplot2` for Bar Plots 82 | 83 | The `ggplot2` package provides a more flexible and powerful way to create bar plots. 84 | 85 | #### Step 1: Install and Load `ggplot2` 86 | 87 | If you haven't installed `ggplot2` yet, you can do so by running: 88 | 89 | ```r 90 | install.packages("ggplot2") 91 | ``` 92 | 93 | Then, load the package: 94 | 95 | ```r 96 | library(ggplot2) 97 | ``` 98 | 99 | #### Step 2: Basic Bar Plot with `ggplot2` 100 | 101 | Create a simple bar plot using `ggplot2`. 102 | 103 | ```r 104 | # Create a data frame 105 | df <- data.frame( 106 | category = c("A", "B", "C", "D"), 107 | counts = c(23, 17, 35, 29) 108 | ) 109 | 110 | # Create a basic bar plot 111 | ggplot(df, aes(x = category, y = counts)) + 112 | geom_bar(stat = "identity") 113 | ``` 114 | ![image](https://github.com/Pankaj-Str/R-Programming-Tutorial/assets/36913690/dadaaee8-cdb7-4ed5-9e5c-18bdae5f2a91) 115 | 116 | #### Step 3: Adding Labels and Titles 117 | 118 | You can add labels and titles using the `labs()` function. 119 | 120 | ```r 121 | # Create a bar plot with labels and a title 122 | ggplot(df, aes(x = category, y = counts)) + 123 | geom_bar(stat = "identity") + 124 | labs(title = "Basic Bar Plot", x = "Categories", y = "Counts") 125 | ``` 126 | ![image](https://github.com/Pankaj-Str/R-Programming-Tutorial/assets/36913690/5e344d16-f0db-4dfb-ba6e-0c6cb93dd416) 127 | 128 | #### Step 4: Customizing Colors 129 | 130 | Colors can be customized using the `fill` aesthetic and the `scale_fill_manual()` function. 
131 | 132 | ```r 133 | # Create a bar plot with custom colors 134 | ggplot(df, aes(x = category, y = counts, fill = category)) + 135 | geom_bar(stat = "identity") + 136 | labs(title = "Bar Plot with Colors", x = "Categories", y = "Counts") + 137 | scale_fill_manual(values = c("red", "blue", "green", "purple")) 138 | ``` 139 | ![image](https://github.com/Pankaj-Str/R-Programming-Tutorial/assets/36913690/19f5cbfd-d198-4226-a64e-ac794d459f44) 140 | 141 | 142 | #### Step 5: Adding Grid Lines and Themes 143 | 144 | Themes can be used to customize the appearance of the plot. 145 | 146 | ```r 147 | # Create a bar plot with a custom theme 148 | ggplot(df, aes(x = category, y = counts, fill = category)) + 149 | geom_bar(stat = "identity") + 150 | labs(title = "Bar Plot with Custom Theme", x = "Categories", y = "Counts") + 151 | scale_fill_manual(values = c("red", "blue", "green", "purple")) + 152 | theme_minimal() 153 | 154 | ``` 155 | ![image](https://github.com/Pankaj-Str/R-Programming-Tutorial/assets/36913690/96f4c016-fcd8-4a0c-af24-2ce99ed7658a) 156 | 157 | 158 | #### Step 6: Horizontal Bar Plot 159 | 160 | Create a horizontal bar plot by using `coord_flip()`. 161 | 162 | ```r 163 | # Create a horizontal bar plot 164 | ggplot(df, aes(x = category, y = counts, fill = category)) + 165 | geom_bar(stat = "identity") + 166 | labs(title = "Horizontal Bar Plot", x = "Categories", y = "Counts") + 167 | scale_fill_manual(values = c("red", "blue", "green", "purple")) + 168 | coord_flip() 169 | ``` 170 | 171 | ![image](https://github.com/Pankaj-Str/R-Programming-Tutorial/assets/36913690/d86a5699-c15c-4562-8b86-292b9b43062b) 172 | 173 | 174 | -------------------------------------------------------------------------------- /22 Day R Data Visualization /R Bar Plot/README.md: -------------------------------------------------------------------------------- 1 | # R Bar Plot 2 | In R, you can create a bar plot to visualize the distribution or comparison of categorical data. 
Bar plots are commonly used to represent counts or frequencies of categories within a dataset, and base R's `barplot()` function is the simplest way to draw one.

Creating a bar plot in R is a straightforward process, and you can customize it according to your needs. This page walks through each of the following tasks:

```
Create Bar Plot in R
Add Title to a Bar Plot in R
Provide Labels to Axes in R
Provide Names for Each Bar of Bar Plot in R
Change Bar Color in R
Bar Texture in R
Make Bar Plot Horizontal in R
Stacked Bar Plot in R

```


**1. Create a Bar Plot in R:**

Here's how to create a basic bar plot in R:

```R
# Sample data for a bar plot
categories <- c("Category A", "Category B", "Category C", "Category D")
counts <- c(10, 25, 15, 30)

# Create a bar plot
barplot(counts, names.arg = categories, col = "red",
        main = "Bar Plot Example", xlab = "Categories", ylab = "Counts")
```
![image](https://github.com/Pankaj-Str/R-Programming-Tutorial/assets/36913690/438a32b2-d759-4832-9ed4-ca5c40303059)


**2. Add a Title to a Bar Plot:**

You can add a title to your bar plot using the `main` parameter, as shown in the previous example.

**3. Provide Labels to Axes:**

To add labels to the x and y axes, you can use the `xlab` and `ylab` parameters, respectively, as shown in the previous example.

**4. Provide Names for Each Bar:**

The names for each bar are added using the `names.arg` parameter, as shown in the previous example.

**5. Change Bar Color:**

You can change the color of the bars by specifying the `col` parameter.
Here's an example with a different bar color:

```R
barplot(counts, names.arg = categories, col = "lightgreen",
        main = "Bar Plot Example", xlab = "Categories", ylab = "Counts")
```
![image](https://github.com/Pankaj-Str/R-Programming-Tutorial/assets/36913690/1c929f2e-8623-423a-9043-69ae4a412719)


**6. Bar Texture:**

To add texture to the bars, you can use the `density` parameter, which sets the density of the shading lines drawn in each bar:

```R
barplot(counts, names.arg = categories, col = "black",
        main = "Bar Plot with Texture", xlab = "Categories", ylab = "Counts",
        density = c(10, 20, 30, 40))
```
![image](https://github.com/Pankaj-Str/R-Programming-Tutorial/assets/36913690/313622db-25ab-4c03-be41-6c8eb12c2b49)

**7. Make Bar Plot Horizontal:**

To create a horizontal bar plot, you can use the `horiz` parameter:

```R
barplot(counts, names.arg = categories, col = "lightcoral",
        main = "Horizontal Bar Plot", xlab = "Counts", ylab = "Categories",
        horiz = TRUE)
```
![image](https://github.com/Pankaj-Str/R-Programming-Tutorial/assets/36913690/b671e120-79e8-4f6e-a4e5-31ed7186765c)

**8. Stacked Bar Plot:**

A stacked bar plot is created by passing a matrix to `barplot()` with `beside = FALSE` (the default); setting `beside = TRUE` draws the groups side by side instead. Here's an example:

```R
# Data for stacked bar plot: one row per group, one column per category
data <- matrix(c(10, 5, 15, 10, 25, 10, 30, 15), nrow = 2, ncol = 4, byrow = TRUE)
colnames(data) <- categories
rownames(data) <- c("Group 1", "Group 2")  # illustrative labels used by the legend

# Create a stacked bar plot (beside = FALSE stacks the rows within each bar)
barplot(data, beside = FALSE, col = c("red", "blue"),
        main = "Stacked Bar Plot Example", xlab = "Categories", ylab = "Counts",
        legend.text = TRUE)
```
![image](https://github.com/Pankaj-Str/R-Programming-Tutorial/assets/36913690/cb7247bf-c619-435d-8d7c-e25285a6d2b5)


These examples cover various aspects of creating and customizing bar plots in R. You can adjust the parameters to meet your specific needs and preferences.
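
The introduction notes that bar plots usually show counts or frequencies of categories. When the starting point is raw categorical observations rather than pre-computed counts, one common approach is to tabulate them first with base R's `table()` and pass the result to `barplot()`. The sketch below assumes a made-up `responses` vector; the data and variable names are illustrative only.

```R
# Hypothetical raw observations (illustrative data, not from the tutorial)
responses <- c("Yes", "No", "Yes", "Maybe", "Yes", "No")

# table() counts how often each category occurs
response_counts <- table(responses)
print(response_counts)

# barplot() accepts the table directly and uses its names as bar labels;
# it returns the bar midpoints, which are kept here for later annotation
bar_midpoints <- barplot(response_counts,
                         main = "Responses (illustrative data)",
                         xlab = "Response", ylab = "Count", col = "steelblue")
```

Keeping the tabulation as a separate step makes it easy to check the counts before plotting them.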

--------------------------------------------------------------------------------
/22 Day R Data Visualization /R Boxplot /Example.md:
--------------------------------------------------------------------------------
## **How to Create a Boxplot in R Programming**

A **Boxplot** is a powerful visualization tool that provides a summary of the distribution of a dataset. It shows the median, quartiles, and outliers at a glance. In this tutorial, we’ll go through the basics of creating and customizing boxplots in R.

---

### **What You Will Learn:**
1. What is a Boxplot?
2. Components of a Boxplot.
3. How to Create a Boxplot in R.
4. Customizing Boxplots (Adding Colors, Labels, etc.).
5. Using Boxplots with Grouped Data.
6. Advanced Customization Examples.

---

### **1. What is a Boxplot?**

A **Boxplot** (or Box-and-Whisker Plot) is a standardized way of displaying the distribution of data based on:
- Minimum value (excluding outliers)
- First quartile (Q1)
- Median
- Third quartile (Q3)
- Maximum value (excluding outliers)
- Outliers (values more than 1.5 times the interquartile range below Q1 or above Q3)

Boxplots are great for comparing distributions and identifying outliers in datasets.

---

### **2. Components of a Boxplot**
Before creating a boxplot, let's understand its key components:
- **Box**: Represents the interquartile range (IQR) (Q1 to Q3).
- **Line inside the box**: The median of the data.
- **Whiskers**: Extend to the smallest and largest values that lie within 1.5 * IQR of the box (below Q1 and above Q3).
- **Outliers**: Points beyond the whiskers.

---

### **3. How to Create a Boxplot in R**

#### **Step 1: Set up your environment**
First, ensure R is installed (RStudio is optional). Load any necessary libraries (e.g., `ggplot2` for advanced plots).
44 | 45 | ```R 46 | # Basic setup 47 | # No special library needed for base boxplot function 48 | ``` 49 | 50 | #### **Step 2: Create a Basic Boxplot** 51 | 52 | Let’s create a basic boxplot using the `boxplot()` function. 53 | 54 | ```R 55 | # Example Dataset 56 | data <- c(5, 7, 8, 12, 15, 18, 20, 22, 24, 30, 35, 40, 45) 57 | 58 | # Creating a boxplot 59 | boxplot(data, main = "Basic Boxplot Example", xlab = "Dataset", ylab = "Values") 60 | ``` 61 | 62 | **Explanation:** 63 | - `boxplot(data)`: Generates the boxplot. 64 | - `main`: Adds a title to the plot. 65 | - `xlab` and `ylab`: Label the x and y axes. 66 | 67 | --- 68 | 69 | ### **4. Customizing Boxplots** 70 | 71 | You can customize the appearance of your boxplots for better readability. 72 | 73 | #### **Adding Colors** 74 | ```R 75 | # Customizing the boxplot 76 | boxplot(data, 77 | main = "Boxplot with Custom Colors", 78 | xlab = "Dataset", 79 | ylab = "Values", 80 | col = "skyblue", 81 | border = "darkblue") 82 | ``` 83 | 84 | **Explanation:** 85 | - `col`: Fills the box with the specified color. 86 | - `border`: Changes the color of the box outline. 87 | 88 | #### **Adding Notches to Show Confidence Intervals** 89 | ```R 90 | boxplot(data, 91 | main = "Boxplot with Notches", 92 | notch = TRUE, 93 | col = "orange") 94 | ``` 95 | **Explanation:** 96 | - `notch = TRUE`: Adds notches to the boxplot to visualize the confidence interval of the median. 97 | 98 | --- 99 | 100 | ### **5. Boxplot with Grouped Data** 101 | 102 | When working with grouped datasets, you can create boxplots to compare categories. 
103 | 104 | ```R 105 | # Example Dataset 106 | group_data <- data.frame( 107 | Values = c(5, 7, 12, 15, 18, 20, 35, 40, 45, 10, 12, 22, 25, 28, 30), 108 | Category = rep(c("A", "B", "C"), each = 5) 109 | ) 110 | 111 | # Grouped boxplot 112 | boxplot(Values ~ Category, 113 | data = group_data, 114 | main = "Boxplot for Grouped Data", 115 | xlab = "Category", 116 | ylab = "Values", 117 | col = c("pink", "lightgreen", "skyblue")) 118 | ``` 119 | 120 | **Explanation:** 121 | - `Values ~ Category`: Formula specifying the relationship between the data and groups. 122 | - `data`: The dataset containing the values and groups. 123 | - `col`: Assigns different colors to each group. 124 | 125 | --- 126 | 127 | ### **6. Advanced Customizations with ggplot2** 128 | 129 | The `ggplot2` library offers advanced customization and aesthetics. 130 | 131 | #### **Install ggplot2 if not already installed** 132 | ```R 133 | install.packages("ggplot2") 134 | ``` 135 | 136 | #### **Create a Boxplot with ggplot2** 137 | ```R 138 | library(ggplot2) 139 | 140 | # Example Dataset 141 | group_data <- data.frame( 142 | Values = c(5, 7, 12, 15, 18, 20, 35, 40, 45, 10, 12, 22, 25, 28, 30), 143 | Category = rep(c("A", "B", "C"), each = 5) 144 | ) 145 | 146 | # Boxplot using ggplot2 147 | ggplot(group_data, aes(x = Category, y = Values, fill = Category)) + 148 | geom_boxplot() + 149 | ggtitle("Boxplot with ggplot2") + 150 | xlab("Category") + 151 | ylab("Values") + 152 | theme_minimal() 153 | ``` 154 | 155 | **Explanation:** 156 | - `aes(x = Category, y = Values, fill = Category)`: Maps the variables to the axes and assigns colors. 157 | - `geom_boxplot()`: Adds the boxplot layer. 158 | - `theme_minimal()`: Applies a clean theme to the plot. 159 | 160 | --- 161 | 162 | ### **7. Handling Outliers in Boxplots** 163 | 164 | Outliers can be identified using boxplots. 
To remove or highlight them: 165 | ```R 166 | # Highlighting outliers in ggplot2 167 | ggplot(group_data, aes(x = Category, y = Values)) + 168 | geom_boxplot(outlier.colour = "red", outlier.shape = 16) + 169 | ggtitle("Boxplot Highlighting Outliers") + 170 | xlab("Category") + 171 | ylab("Values") + 172 | theme_minimal() 173 | ``` 174 | 175 | --- 176 | 177 | ### **8. Saving the Plot** 178 | 179 | You can save your boxplots to your computer: 180 | ```R 181 | # Save the plot 182 | png("boxplot_example.png") 183 | boxplot(data, main = "Boxplot Example", col = "lightblue") 184 | dev.off() 185 | ``` 186 | 187 | --- 188 | 189 | ### **Conclusion** 190 | 191 | Boxplots are essential for analyzing data distributions and identifying outliers in R. Using the `boxplot()` function for simple visualizations or `ggplot2` for advanced customization makes R a flexible tool for creating boxplots. 192 | 193 | For more R programming tutorials, stay tuned to **Codes With Pankaj**! 🚀 194 | 195 | -------------------------------------------------------------------------------- /22 Day R Data Visualization /R Boxplot /README.md: -------------------------------------------------------------------------------- 1 | # R Boxplot 2 | 3 | Creating and customizing box plots in R is an essential part of data visualization, especially when you need to summarize the distribution of a dataset. This tutorial will guide you through the steps to create and customize box plots using both base R and the `ggplot2` package. 4 | 5 | ### Using `boxplot()` in Base R 6 | 7 | #### Step 1: Basic Box Plot 8 | 9 | First, we will create a simple box plot using the `boxplot()` function from base R. 
10 | 11 | ```r 12 | # Generate sample data 13 | data <- rnorm(100, mean = 50, sd = 10) 14 | 15 | # Create a basic box plot 16 | boxplot(data) 17 | ``` 18 | 19 | Output: 20 | ![image](https://github.com/Pankaj-Str/R-Programming-Tutorial/assets/36913690/04090250-5b1b-438d-a933-dbd717f8bfb8) 21 | 22 | 23 | #### Step 2: Box Plot with Multiple Groups 24 | 25 | You can create a box plot for multiple groups. 26 | 27 | ```r 28 | # Generate sample data for multiple groups 29 | set.seed(123) 30 | group1 <- rnorm(50, mean = 50, sd = 10) 31 | group2 <- rnorm(50, mean = 60, sd = 15) 32 | group3 <- rnorm(50, mean = 55, sd = 20) 33 | 34 | # Combine the data into a data frame 35 | data <- data.frame( 36 | values = c(group1, group2, group3), 37 | group = factor(rep(c("Group 1", "Group 2", "Group 3"), each = 50)) 38 | ) 39 | 40 | # Create a box plot for multiple groups 41 | boxplot(values ~ group, data = data) 42 | ``` 43 | 44 | Output: 45 | ![image](https://github.com/Pankaj-Str/R-Programming-Tutorial/assets/36913690/76401fa6-c00b-4d67-a88f-68d0af74ac6f) 46 | 47 | 48 | #### Step 3: Adding Titles and Labels 49 | 50 | You can add titles and labels to make the plot more informative. 51 | 52 | ```r 53 | # Create a box plot with titles and labels 54 | boxplot(values ~ group, data = data, 55 | main = "Box Plot of Values by Group", 56 | xlab = "Group", 57 | ylab = "Values") 58 | ``` 59 | 60 | Output: 61 | ![image](https://github.com/Pankaj-Str/R-Programming-Tutorial/assets/36913690/1f462e1c-b66c-4f9f-abc0-ad1913549dda) 62 | 63 | 64 | #### Step 4: Customizing Colors 65 | 66 | Colors can be customized using the `col` parameter. 
67 | 68 | ```r 69 | # Create a box plot with custom colors 70 | boxplot(values ~ group, data = data, 71 | main = "Box Plot with Custom Colors", 72 | xlab = "Group", 73 | ylab = "Values", 74 | col = c("red", "blue", "green")) 75 | ``` 76 | 77 | Output: 78 | ![image](https://github.com/Pankaj-Str/R-Programming-Tutorial/assets/36913690/5a31a6e7-6743-40a6-8522-d8f7a417e19d) 79 | 80 | 81 | ### Using `ggplot2` for Box Plots 82 | 83 | The `ggplot2` package provides a more flexible and powerful way to create box plots. 84 | 85 | #### Step 1: Install and Load `ggplot2` 86 | 87 | If you haven't installed `ggplot2` yet, you can do so by running: 88 | 89 | ```r 90 | install.packages("ggplot2") 91 | ``` 92 | 93 | Then, load the package: 94 | 95 | ```r 96 | library(ggplot2) 97 | ``` 98 | 99 | #### Step 2: Basic Box Plot with `ggplot2` 100 | 101 | Create a simple box plot using `ggplot2`. 102 | 103 | ```r 104 | # Create a basic box plot 105 | ggplot(data, aes(x = group, y = values)) + 106 | geom_boxplot() 107 | ``` 108 | 109 | Output: 110 | ![image](https://github.com/Pankaj-Str/R-Programming-Tutorial/assets/36913690/e7153d77-2325-4efe-8713-f82c1d38f9e1) 111 | 112 | 113 | #### Step 3: Adding Titles and Labels 114 | 115 | You can add titles and labels using the `labs()` function. 116 | 117 | ```r 118 | # Create a box plot with titles and labels 119 | ggplot(data, aes(x = group, y = values)) + 120 | geom_boxplot() + 121 | labs(title = "Box Plot of Values by Group", x = "Group", y = "Values") 122 | ``` 123 | 124 | Output: 125 | ![image](https://github.com/Pankaj-Str/R-Programming-Tutorial/assets/36913690/0fa306ac-a455-4b18-9ce9-91d58f77f79a) 126 | 127 | 128 | #### Step 4: Customizing Colors 129 | 130 | Colors can be customized using the `aes()` and `scale_fill_manual()` functions. 
131 | 132 | ```r 133 | # Create a box plot with custom colors 134 | ggplot(data, aes(x = group, y = values, fill = group)) + 135 | geom_boxplot() + 136 | labs(title = "Box Plot with Custom Colors", x = "Group", y = "Values") + 137 | scale_fill_manual(values = c("red", "blue", "green")) + 138 | theme_minimal() 139 | ``` 140 | 141 | Output: 142 | ![image](https://github.com/Pankaj-Str/R-Programming-Tutorial/assets/36913690/63f950c5-b949-43fd-b6fb-4910d9607dd5) 143 | 144 | 145 | #### Step 5: Adding Notches 146 | 147 | You can add notches to the box plot to compare groups. 148 | 149 | ```r 150 | # Create a box plot with notches 151 | ggplot(data, aes(x = group, y = values, fill = group)) + 152 | geom_boxplot(notch = TRUE) + 153 | labs(title = "Box Plot with Notches", x = "Group", y = "Values") + 154 | scale_fill_manual(values = c("red", "blue", "green")) + 155 | theme_minimal() 156 | ``` 157 | 158 | Output: 159 | ![image](https://github.com/Pankaj-Str/R-Programming-Tutorial/assets/36913690/4c1740f7-408e-4917-ba91-c69a758b1a84) 160 | 161 | 162 | #### Step 6: Horizontal Box Plot 163 | 164 | Create a horizontal box plot by using `coord_flip()`. 165 | 166 | ```r 167 | # Create a horizontal box plot 168 | ggplot(data, aes(x = group, y = values, fill = group)) + 169 | geom_boxplot() + 170 | labs(title = "Horizontal Box Plot", x = "Group", y = "Values") + 171 | scale_fill_manual(values = c("red", "blue", "green")) + 172 | coord_flip() + 173 | theme_minimal() 174 | ``` 175 | 176 | Output: 177 | ![image](https://github.com/Pankaj-Str/R-Programming-Tutorial/assets/36913690/a5ce2a8c-cb19-4446-90f3-f985d8a94c7c) 178 | 179 | 180 | ### Conclusion 181 | 182 | In this tutorial, we covered the basics of creating and customizing box plots in R using both base R's `boxplot()` function and the `ggplot2` package. This included creating box plots for single and multiple groups, adding titles and labels, customizing colors, adding notches, and creating horizontal box plots. 
Box plots are a powerful tool for summarizing the distribution of a dataset and comparing different groups. 183 | -------------------------------------------------------------------------------- /22 Day R Data Visualization /R Histogram/Example.md: -------------------------------------------------------------------------------- 1 | ## **How to Create a Histogram in R Programming – A Step-by-Step Guide** 2 | 3 | A **Histogram** is a powerful tool for visualizing the distribution of a numeric dataset. It divides the data into intervals (bins) and represents the frequency of data points in each interval. 4 | 5 | --- 6 | 7 | ### **What You Will Learn:** 8 | 1. What is a Histogram? 9 | 2. How to Create a Histogram in R (Basic Method). 10 | 3. Customizing Histograms (Colors, Titles, Axis Labels, etc.). 11 | 4. Using Histograms with Grouped Data. 12 | 5. Advanced Customizations with `ggplot2`. 13 | 6. Saving Histograms. 14 | 15 | --- 16 | 17 | ### **1. What is a Histogram?** 18 | 19 | A **Histogram** is a graphical representation of the frequency distribution of a dataset. It displays: 20 | - **Bins**: Intervals dividing the data. 21 | - **Bars**: Heights represent the frequency (or density) of data in each bin. 22 | 23 | --- 24 | 25 | ### **2. How to Create a Histogram in R (Basic Method)** 26 | 27 | R’s base function `hist()` is used to create histograms. 28 | 29 | #### **Step 1: Set up your environment** 30 | ```R 31 | # No special library is needed for the basic histogram function 32 | ``` 33 | 34 | #### **Step 2: Create a Basic Histogram** 35 | 36 | ```R 37 | # Example Dataset 38 | data <- c(5, 7, 8, 12, 15, 18, 20, 22, 24, 30, 35, 40, 45) 39 | 40 | # Basic Histogram 41 | hist(data, main = "Basic Histogram Example", xlab = "Values", ylab = "Frequency") 42 | ``` 43 | 44 | **Explanation:** 45 | - `hist(data)`: Creates a histogram for the dataset. 46 | - `main`: Adds a title to the plot. 47 | - `xlab` and `ylab`: Label the x and y axes. 48 | 49 | --- 50 | 51 | ### **3. 
Customizing Histograms** 52 | 53 | #### **Changing the Number of Bins** 54 | You can specify the number of bins using the `breaks` parameter. 55 | ```R 56 | # Customizing number of bins 57 | hist(data, 58 | main = "Histogram with Custom Bins", 59 | xlab = "Values", 60 | ylab = "Frequency", 61 | breaks = 5) 62 | ``` 63 | 64 | **Explanation:** 65 | - `breaks = 5`: Divides the data into 5 bins. 66 | 67 | #### **Adding Colors** 68 | ```R 69 | # Adding colors 70 | hist(data, 71 | main = "Histogram with Colors", 72 | xlab = "Values", 73 | ylab = "Frequency", 74 | col = "lightblue", 75 | border = "darkblue") 76 | ``` 77 | 78 | **Explanation:** 79 | - `col`: Fills the bars with a specified color. 80 | - `border`: Changes the color of the bar outlines. 81 | 82 | --- 83 | 84 | ### **4. Using Density Instead of Frequency** 85 | 86 | A histogram can display **density** instead of frequency by setting the `freq` parameter to `FALSE`. 87 | 88 | ```R 89 | # Density Histogram 90 | hist(data, 91 | main = "Density Histogram", 92 | xlab = "Values", 93 | ylab = "Density", 94 | freq = FALSE, 95 | col = "lightgreen") 96 | 97 | # Adding a density curve 98 | lines(density(data), col = "red", lwd = 2) 99 | ``` 100 | 101 | **Explanation:** 102 | - `freq = FALSE`: Switches the y-axis from frequency to density. 103 | - `lines(density(data))`: Adds a smooth density curve to the histogram. 104 | 105 | --- 106 | 107 | ### **5. Creating a Histogram for Grouped Data** 108 | 109 | You can visualize grouped data using colors or facets. 
110 | 111 | #### **Example Dataset** 112 | ```R 113 | # Example Dataset 114 | group_data <- data.frame( 115 | Values = c(5, 7, 8, 12, 15, 18, 20, 22, 24, 30, 35, 40, 45, 10, 25), 116 | Group = rep(c("A", "B"), times = c(8, 7)) # 8 values in A, 7 in B (15 labels for 15 values) 117 | ) 118 | ``` 119 | 120 | #### **Grouped Histogram** 121 | ```R 122 | # Create separate histograms for each group 123 | hist(group_data$Values[group_data$Group == "A"], 124 | main = "Histogram for Group A", 125 | xlab = "Values", 126 | col = "lightblue") 127 | 128 | hist(group_data$Values[group_data$Group == "B"], 129 | main = "Histogram for Group B", 130 | xlab = "Values", 131 | col = "lightgreen") 132 | ``` 133 | 134 | --- 135 | 136 | ### **6. Advanced Customizations with `ggplot2`** 137 | 138 | For more advanced and aesthetically pleasing histograms, use the `ggplot2` package. 139 | 140 | #### **Install ggplot2 if not already installed** 141 | ```R 142 | install.packages("ggplot2") 143 | ``` 144 | 145 | #### **Create a Histogram with ggplot2** 146 | ```R 147 | library(ggplot2) 148 | 149 | # Example Dataset 150 | data <- data.frame( 151 | Values = c(5, 7, 8, 12, 15, 18, 20, 22, 24, 30, 35, 40, 45) 152 | ) 153 | 154 | # ggplot2 Histogram 155 | ggplot(data, aes(x = Values)) + 156 | geom_histogram(binwidth = 5, fill = "skyblue", color = "black") + 157 | ggtitle("Histogram with ggplot2") + 158 | xlab("Values") + 159 | ylab("Frequency") + 160 | theme_minimal() 161 | ``` 162 | 163 | **Explanation:** 164 | - `aes(x = Values)`: Maps the variable to the x-axis. 165 | - `geom_histogram(binwidth = 5)`: Specifies the bin width (size of each interval). 166 | - `fill` and `color`: Set the bar fill color and the bar outline color. 167 | 168 | --- 169 | 170 | ### **7. Comparing Grouped Data in ggplot2** 171 | 172 | If you have grouped data, you can fill the bars by group and place them side by side to compare distributions.
173 | 174 | ```R 175 | # Grouped Histogram with ggplot2 176 | group_data <- data.frame( 177 | Values = c(5, 7, 8, 12, 15, 18, 20, 22, 24, 30, 35, 40, 45, 10, 25), 178 | Group = rep(c("A", "B"), times = c(8, 7)) # 8 values in A, 7 in B (15 labels for 15 values) 179 | ) 180 | 181 | ggplot(group_data, aes(x = Values, fill = Group)) + 182 | geom_histogram(binwidth = 5, position = "dodge", color = "black") + 183 | ggtitle("Grouped Histogram with ggplot2") + 184 | xlab("Values") + 185 | ylab("Frequency") + 186 | theme_minimal() 187 | ``` 188 | 189 | **Explanation:** 190 | - `fill = Group`: Assigns colors to the bars based on groups. 191 | - `position = "dodge"`: Places the bars side by side. 192 | 193 | --- 194 | 195 | ### **8. Saving the Plot** 196 | 197 | Save your histogram as an image file: 198 | ```R 199 | # Save the plot (data is the data frame from the ggplot2 section) 200 | png("histogram_example.png") 201 | hist(data$Values, main = "Histogram Example", col = "lightblue") 202 | dev.off() 203 | ``` 204 | 205 | --- 206 | 207 | ### **Conclusion** 208 | 209 | Histograms are an excellent tool for visualizing data distributions. Using the `hist()` function in R or the `ggplot2` library, you can create and customize histograms to fit your analysis needs. 210 | 211 | For more R programming tutorials, visit **Codes With Pankaj**! 🚀 212 | -------------------------------------------------------------------------------- /22 Day R Data Visualization /R Histogram/README.md: -------------------------------------------------------------------------------- 1 | # R Histogram 2 | 3 | In R, a histogram is a graphical representation of the distribution of a continuous numerical variable. It divides the data into "bins" or intervals and counts the number of data points that fall into each bin. Histograms are useful for visualizing the shape and spread of data. Here's how to create a histogram in R: 4 | 5 | **1. Create a Histogram:** 6 | 7 | To create a histogram in R, you can use the `hist()` function.
You need to provide the data you want to plot and specify the number of bins (intervals) or let R choose the default number of bins. 8 | 9 | ```R 10 | # Sample data for a histogram 11 | data <- c(22, 25, 27, 30, 32, 32, 33, 35, 36, 38, 39, 40, 40, 41, 42, 43, 45, 45, 46, 50) 12 | 13 | # Create a histogram with default number of bins 14 | hist(data, col = "lightblue", main = "Histogram Example", xlab = "Values", ylab = "Frequency") 15 | ``` 16 | ![image](https://github.com/Pankaj-Str/R-Programming-Tutorial/assets/36913690/14386307-5933-4f1c-b521-e4bf61402a92) 17 | 18 | 19 | In this example: 20 | 21 | - `data` contains the numerical data you want to create a histogram for. 22 | - `hist(data)` creates the histogram with the default number of bins. 23 | - `col = "lightblue"` sets the color of the bars. 24 | - `main` and `xlab` are used for the main title and x-axis label, respectively. 25 | 26 | **2. Customizing Histograms:** 27 | 28 | You can customize histograms by specifying the number of bins, changing the color, adding titles, labels, and more. 
Here are some examples: 29 | 30 | ```R 31 | # Create a histogram with a suggested number of bins (R may adjust it to get round break points) 32 | hist(data, breaks = 5, col = "lightgreen", main = "Customized Histogram", 33 | xlab = "Values", ylab = "Frequency") 34 | ``` 35 | ![image](https://github.com/Pankaj-Str/R-Programming-Tutorial/assets/36913690/a3838500-aba0-452d-9be0-d7513acdb339) 36 | 37 | ```R 38 | # Adding specific bin boundaries 39 | hist(data, breaks = c(20, 30, 40, 50), col = "lightcoral", main = "Histogram with Custom Bins", 40 | xlab = "Values", ylab = "Frequency") 41 | ``` 42 | ![image](https://github.com/Pankaj-Str/R-Programming-Tutorial/assets/36913690/ebb2b26a-5e72-4a30-8477-435f10ee1db2) 43 | 44 | ```R 45 | # Adding a density curve (prob = TRUE puts density, not frequency, on the y-axis) 46 | hist(data, breaks = c(20, 30, 45, 50), col = "lightblue", main = "Histogram with Density Curve", 47 | xlab = "Values", ylab = "Density", prob = TRUE) 48 | lines(density(data), col = "red") 49 | ``` 50 | ![image](https://github.com/Pankaj-Str/R-Programming-Tutorial/assets/36913690/e50dce86-cf8a-40ab-aded-ba7fef9a14cd) 51 | 52 | ```R 53 | # Plotting densities instead of counts (bar areas sum to 1; equals relative frequency only for bin width 1) 54 | hist(data, col = "lightgray", main = "Histogram with Relative Frequencies", 55 | xlab = "Values", ylab = "Density", prob = TRUE) 56 | ``` 57 | 58 | ![image](https://github.com/Pankaj-Str/R-Programming-Tutorial/assets/36913690/47ea99ac-1b42-455c-bcfe-dd414751cd54) 59 | 60 | 61 | These examples show various customizations you can apply to your histograms, including suggesting the number of bins, setting explicit bin boundaries, setting different colors, adding density curves, and plotting densities instead of counts. 62 | 63 | Histograms are a powerful tool for visualizing the distribution of data, making them a fundamental part of data exploration and analysis in R.
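The difference between counts, relative frequencies, and densities can be checked numerically. The following is a minimal sketch, assuming the same `data` vector as in the examples above; the names `h`, `rel_freq`, and `bin_width` are illustrative and not part of the original tutorial. It calls `hist()` with `plot = FALSE`, which returns the bin statistics without drawing anything.

```R
# Same sample data as in the histogram examples
data <- c(22, 25, 27, 30, 32, 32, 33, 35, 36, 38, 39, 40, 40, 41, 42, 43, 45, 45, 46, 50)

# Compute the bins without drawing the plot
h <- hist(data, breaks = c(20, 30, 40, 50), plot = FALSE)

# Number of observations in each bin
print(h$counts)

# Relative frequency: share of observations in each bin (sums to 1)
rel_freq <- h$counts / sum(h$counts)
print(rel_freq)

# Density: relative frequency divided by the bin width (bar areas sum to 1)
bin_width <- diff(h$breaks)
print(h$density)

# Multiplying density by bin width recovers the relative frequencies
print(h$density * bin_width)
```

Because every bin here is 10 units wide, each density is the relative frequency divided by 10. To plot true relative frequencies, one option is to pass `rel_freq` to `barplot()` rather than relying on `prob = TRUE`.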
64 | 65 | # Example 66 | 67 | ```R 68 | # Sample data for a histogram 69 | data <- c(22, 25, 27, 30, 32, 32, 33, 35, 36, 38, 39, 40, 40, 41, 42, 43, 45, 45, 46, 50) 70 | 71 | # Create a histogram with customizations 72 | hist(data, 73 | col = "skyblue", # Setting bar color 74 | main = "Customized Histogram", # Adding a title 75 | xlab = "Values", # Label for the x-axis 76 | ylab = "Frequency", # Label for the y-axis 77 | xlim = c(20, 55), # Setting the range of the x-axis 78 | ylim = c(0, 5), # Setting the range of the y-axis 79 | breaks = 5) # Specifying the number of bins 80 | 81 | # Adding a legend for the bar color 82 | legend("topright", legend = "Bar Color", fill = "skyblue") 83 | 84 | # Adding gridlines to the plot 85 | grid() 86 | 87 | 88 | 89 | ``` 90 | ![image](https://github.com/Pankaj-Str/R-Programming-Tutorial/assets/36913690/381aef50-cf4d-43bd-b93d-82b2152e5c7f) 91 | 92 | -------------------------------------------------------------------------------- /22 Day R Data Visualization /R Pie Chart/Advanced customizations and explanations.md: -------------------------------------------------------------------------------- 1 | # Advanced customizations and explanations 2 | 3 | 4 | ### **1. Understanding the Basics of Pie Charts** 5 | A pie chart is a circular representation of data, where the size of each "slice" corresponds to its proportion relative to the total. It’s primarily used to show percentages or proportions. 6 | 7 | #### Key Arguments of `pie()` Function 8 | - **`x`**: A numeric vector containing the values for the slices. 9 | - **`labels`**: A character vector for labeling the slices. 10 | - **`col`**: A vector of colors for the slices. 11 | - **`main`**: A character string specifying the title of the chart. 12 | - **`radius`**: A value specifying the radius of the pie chart (default is 1). 13 | 14 | --- 15 | 16 | ### **2. Setting Up Your Data** 17 | Your data should ideally be in percentages or values that can be represented as proportions. 
18 | 19 | ```R 20 | # Example Data: Sales percentages for four products 21 | sales <- c(40, 25, 20, 15) 22 | categories <- c("Product A", "Product B", "Product C", "Product D") 23 | ``` 24 | 25 | --- 26 | 27 | ### **3. Creating a Basic Pie Chart** 28 | Let’s create a simple pie chart with default settings: 29 | 30 | ```R 31 | pie(sales, labels = categories, main = "Sales Distribution") 32 | ``` 33 | 34 | --- 35 | 36 | ### **4. Adding Percentages to Labels** 37 | To make the chart more informative, add percentages as part of the labels: 38 | 39 | ```R 40 | # Calculate percentages 41 | percentages <- round(sales / sum(sales) * 100, 1) # Rounded to 1 decimal place 42 | 43 | # Add percentages to labels 44 | labels_with_percentages <- paste(categories, "-", percentages, "%") 45 | 46 | # Create the pie chart 47 | pie(sales, labels = labels_with_percentages, main = "Sales Distribution with Percentages") 48 | ``` 49 | 50 | --- 51 | 52 | ### **5. Customizing Colors** 53 | Colors can make your pie chart visually appealing: 54 | 55 | ```R 56 | # Define a custom color palette 57 | colors <- c("#FF9999", "#66B2FF", "#99FF99", "#FFCC99") 58 | 59 | # Apply the colors 60 | pie(sales, 61 | labels = labels_with_percentages, 62 | col = colors, 63 | main = "Customized Pie Chart") 64 | ``` 65 | 66 | --- 67 | 68 | ### **6. Highlighting a Slice (Exploding a Slice)** 69 | Base R's `pie()` cannot pull out a single slice; its `radius` parameter only scales the whole chart. The helper below computes x/y offsets for illustration and is not used by the `pie()` call: 70 | 71 | ```R 72 | # Helper that computes x/y offsets per slice (explode_index is unused; pie() cannot apply these shifts) 73 | explode_slice <- function(values, explode_index, explode_factor = 0.1) { 74 | theta <- cumsum(values / sum(values) * 2 * pi) 75 | x_shift <- c(0, diff(sin(theta))) * explode_factor 76 | y_shift <- c(0, diff(cos(theta))) * explode_factor 77 | list(x_shift = x_shift, y_shift = y_shift) 78 | } 79 | 80 | # Example: draw a larger chart with radius (no slice is actually pulled out) 81 | pie(sales, 82 | labels = labels_with_percentages, 83 | col = colors, 84 | main = "Exploded Pie Chart", 85 | radius = 1.2) 86 | ``` 87 | 88 | --- 89 | 90 | ### **7.
Adding Legends** 91 | If labels are too crowded, use a legend instead: 92 | 93 | ```R 94 | # Create the pie chart without labels 95 | pie(sales, col = colors, main = "Pie Chart with Legend", labels = NA) 96 | 97 | # Add a legend 98 | legend("topright", legend = categories, fill = colors, title = "Products") 99 | ``` 100 | 101 | --- 102 | 103 | ### **8. Creating a Donut Chart** 104 | A donut chart is a variation of the pie chart with a hole in the middle. 105 | 106 | ```R 107 | # Use the plotrix package for donut charts 108 | if (!require("plotrix")) install.packages("plotrix") 109 | library(plotrix) 110 | 111 | # Create a donut chart 112 | pie(sales, 113 | labels = labels_with_percentages, 114 | col = colors, 115 | main = "Donut Chart", 116 | radius = 0.9) 117 | draw.circle(0, 0, 0.4, col = "white") # Create a hole in the center 118 | ``` 119 | 120 | --- 121 | 122 | ### **9. Creating a Pie Chart with ggplot2** 123 | `ggplot2` provides advanced customization for pie charts. However, it uses a bar chart transformed into a pie chart. 124 | 125 | ```R 126 | # Install ggplot2 if needed 127 | if (!require("ggplot2")) install.packages("ggplot2") 128 | library(ggplot2) 129 | 130 | # Prepare the data 131 | data <- data.frame( 132 | categories = categories, 133 | sales = sales 134 | ) 135 | 136 | # Create the pie chart 137 | ggplot(data, aes(x = "", y = sales, fill = categories)) + 138 | geom_bar(stat = "identity", width = 1) + 139 | coord_polar("y", start = 0) + 140 | theme_void() + # Remove background and gridlines 141 | labs(title = "Pie Chart with ggplot2") + 142 | scale_fill_manual(values = colors) 143 | ``` 144 | 145 | --- 146 | 147 | ### **10. 
Saving the Chart** 148 | Save the plot to an image file: 149 | 150 | ```R 151 | # Save as PNG 152 | png("custom_pie_chart.png", width = 800, height = 600) 153 | pie(sales, 154 | labels = labels_with_percentages, 155 | col = colors, 156 | main = "Saved Pie Chart") 157 | dev.off() 158 | ``` 159 | 160 | --- 161 | 162 | ### **Summary of Advanced Features** 163 | | Feature | Implementation | 164 | |------------------------|-------------------------------| 165 | | Add Percentages | `paste(labels, "-", percentages, "%")` | 166 | | Custom Colors | `col = c("red", "blue", ...)` | 167 | | Exploding Slices | Not supported by `pie()`; `radius` only resizes the chart | 168 | | Adding Legends | `legend("topright", ...)` | 169 | | Donut Chart | `draw.circle()` or ggplot2 | 170 | | ggplot2 Customization | `coord_polar()` and `theme_void()` | 171 | 172 | 173 | -------------------------------------------------------------------------------- /22 Day R Data Visualization /R Pie Chart/Example01.md: -------------------------------------------------------------------------------- 1 | # Create a pie chart 2 | 3 | --- 4 | 5 | ## **Step 1: Install and Load Necessary Packages** 6 | R has built-in functions for creating pie charts, so no extra package is required. However, for customization, you might use `ggplot2`. For this tutorial, we will use base R. 7 | 8 | --- 9 | 10 | ## **Step 2: Create the Data** 11 | We need a vector with numerical values representing the pie chart's segments and optionally labels for those segments. 12 | 13 | ```R 14 | # Example data 15 | values <- c(40, 25, 20, 15) # Percentages 16 | labels <- c("Category A", "Category B", "Category C", "Category D") 17 | ``` 18 | 19 | --- 20 | 21 | ## **Step 3: Create a Basic Pie Chart** 22 | The `pie()` function is used to create a pie chart in R.
23 | 24 | ```R 25 | # Basic pie chart 26 | pie(values, labels) 27 | ``` 28 | 29 | --- 30 | 31 | ## **Step 4: Add Labels and Colors** 32 | You can make your pie chart more informative by adding colors and labels. 33 | 34 | ```R 35 | # Add colors and labels 36 | colors <- c("red", "blue", "green", "yellow") 37 | 38 | pie(values, 39 | labels = paste(labels, "-", values, "%"), 40 | col = colors, 41 | main = "Distribution of Categories") 42 | ``` 43 | 44 | --- 45 | 46 | ## **Step 5: Explode a Slice (Optional)** 47 | Base R's `pie()` has no `explode` parameter; the `radius` argument used below only sets the overall size of the chart. To separate the slices, use `pie3D()` from the `plotrix` package, which accepts an `explode` argument. 48 | 49 | ```R 50 | # radius sets the chart size; no slice is exploded here 51 | pie(values, 52 | labels = labels, 53 | col = colors, 54 | main = "Exploded Pie Chart", 55 | radius = 1) 56 | ``` 57 | 58 | --- 59 | 60 | ## **Step 6: Save the Pie Chart** 61 | To save your chart as an image file, use `png()`, `jpeg()`, or similar functions. 62 | 63 | ```R 64 | # Save pie chart as a PNG file 65 | png("pie_chart.png") 66 | pie(values, 67 | labels = paste(labels, "-", values, "%"), 68 | col = colors, 69 | main = "Saved Pie Chart") 70 | dev.off() 71 | ``` 72 | 73 | --- 74 | 75 | ## **Full Example Code** 76 | 77 | ```R 78 | # Data 79 | values <- c(40, 25, 20, 15) 80 | labels <- c("Category A", "Category B", "Category C", "Category D") 81 | colors <- c("red", "blue", "green", "yellow") 82 | 83 | # Basic pie chart 84 | pie(values, labels) 85 | 86 | # Enhanced pie chart with labels, colors, and title 87 | pie(values, 88 | labels = paste(labels, "-", values, "%"), 89 | col = colors, 90 | main = "Distribution of Categories") 91 | ``` 92 | 93 | --- 94 | 95 | -------------------------------------------------------------------------------- /22 Day R Data Visualization /R Pie Chart/README.md: -------------------------------------------------------------------------------- 1 | # R pie chart 2 | 3 | A pie chart is a circular statistical graphic, which is divided into slices to illustrate numerical
proportion. 4 | 5 | Pie charts represent data visually as a fractional part of a whole, which can be an effective communication tool. 6 | 7 | 8 | 1. **Create a Pie Chart in R:** 9 | 10 | ```R 11 | # Sample data 12 | data <- data.frame( 13 | category = c("Category A", "Category B", "Category C"), 14 | value = c(30, 45, 25) 15 | ) 16 | 17 | # Create the pie chart (pie() draws the plot and returns nothing useful, so no assignment is needed) 18 | pie(data$value, labels = data$category, col = c("red", "green", "blue")) 19 | ``` 20 | 21 | 2. **Add a Title to a Pie Chart in R:** 22 | 23 | ```R 24 | # Add a title to the pie chart that is currently open 25 | title("My Pie Chart") 26 | ``` 27 | 28 | 3. **Add Labels to Each Pie Chart Slice in R:** 29 | 30 | ```R 31 | # Add labels to the pie chart slices 32 | pie(data$value, labels = paste(data$category, data$value), col = c("red", "green", "blue")) 33 | ``` 34 | 35 | 4. **Change the Color of Pie Slices in R:** 36 | 37 | ```R 38 | # Change the colors of the pie chart slices 39 | pie(data$value, labels = data$category, col = c("orange", "purple", "pink")) 40 | ``` 41 | 42 | 5. **Create a 3D Pie Chart in R:** 43 | 44 | ```R 45 | # Create a 3D pie chart (requires the 'plotrix' package) 46 | install.packages("plotrix") 47 | library(plotrix) 48 | 49 | # Sample data 50 | data <- c(30, 45, 25) 51 | labels <- c("Category A", "Category B", "Category C") 52 | colors <- c("red", "green", "blue") 53 | 54 | # Create the 3D pie chart 55 | pie3D(data, labels = labels, explode = 0.1, col = colors) 56 | ``` 57 | 58 | Run snippet 1 first: snippets 2 to 4 reuse its `data` data frame, and `title()` adds to the plot that is already open. Snippet 5 is independent and redefines `data` as a plain numeric vector.
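Slice labels are often easier to read when they show each category's share of the total. The following is a minimal sketch, assuming the same sample data frame as in snippet 1; the names `pct` and `slice_labels` and the chart title are illustrative, not part of the original tutorial.

```R
# Sample data (same values as snippet 1)
data <- data.frame(
  category = c("Category A", "Category B", "Category C"),
  value = c(30, 45, 25)
)

# Share of each slice as a percentage of the total, rounded to 1 decimal place
pct <- round(100 * data$value / sum(data$value), 1)

# Build labels such as "Category A (30%)"
slice_labels <- paste0(data$category, " (", pct, "%)")
print(slice_labels)

# Draw the pie chart with the percentage labels
pie(data$value,
    labels = slice_labels,
    col = c("red", "green", "blue"),
    main = "Share by Category")
```

Computing the percentages from `sum(data$value)` keeps the labels correct even when the raw values do not already add up to 100.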
59 | 60 | 61 | Example 62 | 63 | ```R 64 | # Load required libraries 65 | library(ggplot2) 66 | library(plotly) 67 | 68 | # Sample data 69 | data <- data.frame( 70 | category = c("Category A", "Category B", "Category C"), 71 | value = c(30, 45, 25) 72 | ) 73 | 74 | # Basic Pie Chart 75 | basic_pie_chart <- ggplot(data, aes(x = "", y = value, fill = category)) + 76 | geom_bar(stat = "identity", width = 1) + 77 | coord_polar(theta = "y") + 78 | ggtitle("Basic Pie Chart") + 79 | scale_fill_manual(values = c("Category A" = "red", "Category B" = "green", "Category C" = "blue")) + 80 | theme_void() 81 | 82 | # Add Labels to Slices (x and y are inherited from the base plot so labels sit in the middle of each slice) 83 | labels_pie_chart <- basic_pie_chart + 84 | geom_text(aes(label = paste(category, value, "%")), position = position_stack(vjust = 0.5)) + 85 | theme_void() 86 | 87 | # Interactive pie chart with one slice pulled out (plotly pie charts are 2D, not 3D) 88 | pie_3d <- plot_ly(data, labels = ~category, values = ~value, type = "pie", pull = c(0.1, 0, 0), marker = list(colors = c("red", "green", "blue"))) %>% 89 | layout(title = "Pie Chart with a Pulled-Out Slice") 90 | 91 | # Display the pie charts 92 | print(labels_pie_chart) 93 | pie_3d 94 | 95 | ``` -------------------------------------------------------------------------------- /Case Study/01.md: -------------------------------------------------------------------------------- 1 | We will use the famous "Iris" dataset, which is readily available in R. The case study will involve the following steps: 2 | 3 | 1. **Loading and Exploring the Dataset** 4 | 2. **Data Cleaning** 5 | 3. **Data Visualization** 6 | 4. **Splitting the Data** 7 | 5. **Building a Model** 8 | 6. **Evaluating the Model** 9 | 7. **Making Predictions** 10 | 11 | ### Step 1: Loading and Exploring the Dataset 12 | 13 | First, we will load the Iris dataset and take a look at its structure.
14 | 15 | ```r 16 | # Load the dataset 17 | data(iris) 18 | 19 | # View the first few rows 20 | head(iris) 21 | 22 | # Summary of the dataset 23 | summary(iris) 24 | 25 | # Structure of the dataset 26 | str(iris) 27 | ``` 28 | 29 | ### Step 2: Data Cleaning 30 | 31 | The Iris dataset is already clean, but in a typical case study, you would check for missing values and handle them appropriately. 32 | 33 | ```r 34 | # Check for missing values 35 | sum(is.na(iris)) 36 | ``` 37 | 38 | ### Step 3: Data Visualization 39 | 40 | Visualizing the data helps to understand the relationships between different variables. 41 | 42 | ```r 43 | # Load necessary libraries 44 | library(ggplot2) 45 | 46 | # Scatter plot for Sepal.Length vs Sepal.Width colored by Species 47 | ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, color = Species)) + 48 | geom_point() + 49 | theme_minimal() 50 | 51 | # Pair plot 52 | pairs(iris[1:4], col = iris$Species) 53 | ``` 54 | 55 | ### Step 4: Splitting the Data 56 | 57 | We will split the data into training and testing sets. 58 | 59 | ```r 60 | # Load the caTools package 61 | library(caTools) 62 | 63 | # Set seed for reproducibility 64 | set.seed(123) 65 | 66 | # Split the data 67 | split <- sample.split(iris$Species, SplitRatio = 0.7) 68 | training_set <- subset(iris, split == TRUE) 69 | testing_set <- subset(iris, split == FALSE) 70 | ``` 71 | 72 | ### Step 5: Building a Model 73 | 74 | We will use a decision tree model for this case study. 75 | 76 | ```r 77 | # Load the rpart package 78 | library(rpart) 79 | 80 | # Build the model 81 | model <- rpart(Species ~ ., data = training_set, method = "class") 82 | 83 | # Plot the decision tree 84 | library(rpart.plot) 85 | rpart.plot(model) 86 | ``` 87 | 88 | ### Step 6: Evaluating the Model 89 | 90 | We will evaluate the model's performance on the testing set. 
91 | 92 | ```r 93 | # Make predictions 94 | predictions <- predict(model, testing_set, type = "class") 95 | 96 | # Confusion matrix 97 | confusion_matrix <- table(testing_set$Species, predictions) 98 | print(confusion_matrix) 99 | 100 | # Calculate accuracy 101 | accuracy <- sum(diag(confusion_matrix)) / sum(confusion_matrix) 102 | print(paste("Accuracy:", round(accuracy * 100, 2), "%")) 103 | ``` 104 | 105 | ### Step 7: Making Predictions 106 | 107 | Finally, we can use the model to make predictions on new data. 108 | 109 | ```r 110 | # New data for prediction 111 | new_data <- data.frame(Sepal.Length = 5.1, Sepal.Width = 3.5, Petal.Length = 1.4, Petal.Width = 0.2) 112 | 113 | # Predict the species 114 | predicted_species <- predict(model, new_data, type = "class") 115 | print(predicted_species) 116 | ``` 117 | 118 | This step-by-step case study demonstrates how to load, clean, visualize, split, model, evaluate, and make predictions using the Iris dataset in R. You can adapt these steps to other datasets and more complex analyses as needed. 119 | -------------------------------------------------------------------------------- /Case Study/Data Frame Practice Questions.md: -------------------------------------------------------------------------------- 1 | ### **R Data Frame Practice Questions** 2 | 3 | #### 1. Creating a Data Frame 4 | - Create a data frame named `student_data` with the following columns: 5 | - `Name`: Character vector with values: `"Alice"`, `"Bob"`, `"Charlie"` 6 | - `Age`: Numeric vector with values: `22`, `24`, `23` 7 | - `Grade`: Factor vector with values: `"A"`, `"B"`, `"A"` 8 | 9 | #### 2. Accessing Elements in a Data Frame 10 | Using the `student_data` data frame: 11 | 1. Extract the `Name` column. 12 | 2. Access the value in the second row and third column. 13 | 3. Extract rows where the grade is `"A"`. 14 | 15 | #### 3. Adding and Modifying Columns 16 | 1. Add a new column `Subject` with values: `"Math"`, `"Science"`, `"History"`. 17 | 2. 
Modify the `Age` column by adding 1 to each value. 18 | 19 | #### 4. Subsetting a Data Frame 20 | 1. Display only the `Name` and `Grade` columns. 21 | 2. Filter rows where `Age` is greater than 22. 22 | 23 | #### 5. Summary and Structure 24 | 1. Use the `str()` function to view the structure of the `student_data`. 25 | 2. Use the `summary()` function to get a summary of the data frame. 26 | 27 | #### 6. Sorting and Reordering 28 | 1. Sort the `student_data` by the `Age` column in ascending order. 29 | 2. Reorder the rows by `Grade` in descending order. 30 | 31 | #### 7. Merging and Binding Data Frames 32 | 1. Create another data frame named `new_data` with the following columns: 33 | - `Name`: `"David"`, `"Eva"` 34 | - `Age`: `25`, `23` 35 | - `Grade`: `"B"`, `"A"` 36 | 2. Combine `student_data` and `new_data` using `rbind()`. 37 | 38 | #### 8. Removing Columns and Rows 39 | 1. Remove the `Subject` column from the `student_data`. 40 | 2. Delete the first row from the data frame. 41 | 42 | #### 9. Exporting and Importing Data Frames 43 | 1. Save the `student_data` as a CSV file named `student_data.csv` using `write.csv()`. 44 | 2. Read the file back into R and store it in a variable called `imported_data`. 45 | 46 | #### 10. Bonus Challenge 47 | - Create a data frame `sales_data` with the following columns: 48 | - `Month`: `"Jan"`, `"Feb"`, `"Mar"` 49 | - `Sales`: `1200`, `1500`, `1800` 50 | - `Profit`: `200`, `300`, `400` 51 | - Add a new column `Profit_Percentage` which is calculated as `(Profit / Sales) * 100`. 
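As a hint for the bonus challenge, the formula `(Profit / Sales) * 100` can be applied to whole columns at once because R arithmetic is vectorized. The following is one possible sketch, using only the column names and values given in the question; try the exercise yourself before reading it.

```R
# Build the sales_data data frame from the values in the question
sales_data <- data.frame(
  Month = c("Jan", "Feb", "Mar"),
  Sales = c(1200, 1500, 1800),
  Profit = c(200, 300, 400)
)

# Vectorized arithmetic: one percentage is computed per row
sales_data$Profit_Percentage <- (sales_data$Profit / sales_data$Sales) * 100

print(sales_data)
```

The same pattern (`df$new_column <- expression`) also covers the earlier questions on adding and modifying columns.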
52 | -------------------------------------------------------------------------------- /Case Study/DataSet.md: -------------------------------------------------------------------------------- 1 | dataset for the EMI Loan Default case study: 2 | 3 | ```csv 4 | CustomerID,Age,Gender,Married,Dependents,Education,Self_Employed,ApplicantIncome,CoapplicantIncome,LoanAmount,Loan_Amount_Term,Credit_History,Property_Area,Loan_Status 5 | 1,35,Male,Yes,0,Graduate,No,5849,0,128,360,1,Urban,Y 6 | 2,32,Male,Yes,1,Graduate,No,4583,1508,128,360,1,Rural,N 7 | 3,25,Female,No,0,Graduate,Yes,3000,0,66,360,1,Urban,Y 8 | 4,26,Female,Yes,1,Graduate,No,2583,2358,120,360,1,Urban,Y 9 | 5,45,Male,Yes,0,Not Graduate,No,6000,0,141,360,1,Semiurban,Y 10 | 6,28,Male,No,0,Graduate,Yes,5417,4196,267,360,1,Semiurban,Y 11 | 7,34,Female,No,0,Graduate,No,2333,1516,95,360,1,Urban,N 12 | 8,29,Male,Yes,2,Graduate,No,5000,0,76,360,0,Rural,Y 13 | 9,42,Male,Yes,2,Not Graduate,Yes,4000,1522,180,360,1,Semiurban,N 14 | 10,27,Female,No,0,Graduate,No,2500,1840,100,360,1,Semiurban,Y 15 | 11,45,Female,Yes,1,Not Graduate,No,3500,0,100,360,1,Urban,Y 16 | 12,37,Male,Yes,0,Graduate,Yes,2889,0,30,180,1,Rural,N 17 | 13,30,Male,No,0,Graduate,Yes,6125,0,150,360,1,Rural,Y 18 | 14,37,Female,No,0,Graduate,Yes,5000,0,140,360,0,Urban,N 19 | 15,35,Male,Yes,1,Graduate,No,5500,0,120,360,1,Rural,Y 20 | 16,40,Female,Yes,3,Not Graduate,Yes,2826,1843,135,360,1,Semiurban,N 21 | 17,29,Female,No,0,Graduate,No,7500,0,210,360,1,Urban,Y 22 | 18,25,Male,Yes,1,Graduate,No,3600,0,84,360,1,Semiurban,Y 23 | 19,46,Female,Yes,2,Not Graduate,Yes,3813,0,75,360,1,Semiurban,N 24 | 20,33,Male,Yes,2,Graduate,No,3796,0,90,360,1,Urban,Y 25 | ``` 26 | 27 | Save this dataset as `loan_data.csv` and use it to perform the steps outlined in the case study. 
28 | 29 | ### Creating the CSV file 30 | 31 | Here is the R code to create the CSV file: 32 | 33 | ```r 34 | # Create a data frame with the dataset 35 | loan_data <- data.frame( 36 | CustomerID = 1:20, 37 | Age = c(35, 32, 25, 26, 45, 28, 34, 29, 42, 27, 45, 37, 30, 37, 35, 40, 29, 25, 46, 33), 38 | Gender = c("Male", "Male", "Female", "Female", "Male", "Male", "Female", "Male", "Male", "Female", "Female", "Male", "Male", "Female", "Male", "Female", "Female", "Male", "Female", "Male"), 39 | Married = c("Yes", "Yes", "No", "Yes", "Yes", "No", "No", "Yes", "Yes", "No", "Yes", "Yes", "No", "No", "Yes", "Yes", "No", "Yes", "Yes", "Yes"), 40 | Dependents = c(0, 1, 0, 1, 0, 0, 0, 2, 2, 0, 1, 0, 0, 0, 1, 3, 0, 1, 2, 2), 41 | Education = c("Graduate", "Graduate", "Graduate", "Graduate", "Not Graduate", "Graduate", "Graduate", "Graduate", "Not Graduate", "Graduate", "Not Graduate", "Graduate", "Graduate", "Graduate", "Graduate", "Not Graduate", "Graduate", "Graduate", "Not Graduate", "Graduate"), 42 | Self_Employed = c("No", "No", "Yes", "No", "No", "Yes", "No", "No", "Yes", "No", "No", "Yes", "Yes", "Yes", "No", "Yes", "No", "No", "Yes", "No"), 43 | ApplicantIncome = c(5849, 4583, 3000, 2583, 6000, 5417, 2333, 5000, 4000, 2500, 3500, 2889, 6125, 5000, 5500, 2826, 7500, 3600, 3813, 3796), 44 | CoapplicantIncome = c(0, 1508, 0, 2358, 0, 4196, 1516, 0, 1522, 1840, 0, 0, 0, 0, 0, 1843, 0, 0, 0, 0), 45 | LoanAmount = c(128, 128, 66, 120, 141, 267, 95, 76, 180, 100, 100, 30, 150, 140, 120, 135, 210, 84, 75, 90), 46 | Loan_Amount_Term = c(360, 360, 360, 360, 360, 360, 360, 360, 360, 360, 360, 180, 360, 360, 360, 360, 360, 360, 360, 360), 47 | Credit_History = c(1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1), 48 | Property_Area = c("Urban", "Rural", "Urban", "Urban", "Semiurban", "Semiurban", "Urban", "Rural", "Semiurban", "Semiurban", "Urban", "Rural", "Rural", "Urban", "Rural", "Semiurban", "Urban", "Semiurban", "Semiurban", "Urban"), 49 | Loan_Status = c("Y", "N", 
"Y", "Y", "Y", "Y", "N", "Y", "N", "Y", "Y", "N", "Y", "N", "Y", "N", "Y", "Y", "N", "Y") 50 | ) 51 | 52 | # Save the data frame to a CSV file 53 | write.csv(loan_data, "loan_data.csv", row.names = FALSE) 54 | ``` 55 | 56 | Run this code in your R environment to create the `loan_data.csv` file, which you can then use for the case study. 57 | -------------------------------------------------------------------------------- /Case Study/Predicting EMI Loan Default.md: -------------------------------------------------------------------------------- 1 | ### Case Study: Predicting EMI Loan Default 2 | 3 | #### Scenario: 4 | 5 | You work for a financial institution that provides loans to customers. The institution wants to develop a model to predict whether a customer will default on their EMI (Equated Monthly Installment) loan. Predicting defaults will help the institution manage risk and make informed lending decisions. You are provided with a dataset containing various features about the customers and their loan status. 6 | 7 | #### Dataset: 8 | 9 | The dataset provided is `loan_data.csv` with the following columns: 10 | 11 | - `CustomerID`: Unique identifier for each customer 12 | - `Age`: Age of the customer 13 | - `Gender`: Gender of the customer (Male, Female) 14 | - `Married`: Whether the customer is married (Yes, No) 15 | - `Dependents`: Number of dependents 16 | - `Education`: Education level (Graduate, Not Graduate) 17 | - `Self_Employed`: Whether the customer is self-employed (Yes, No) 18 | - `ApplicantIncome`: Monthly income of the applicant 19 | - `CoapplicantIncome`: Monthly income of the co-applicant 20 | - `LoanAmount`: Loan amount (in thousands) 21 | - `Loan_Amount_Term`: Term of the loan (in months) 22 | - `Credit_History`: Credit history (1: good, 0: bad) 23 | - `Property_Area`: Area of the property (Urban, Semiurban, Rural) 24 | - `Loan_Status`: Whether the loan was approved (Y, N) 25 | 26 | #### Task: 27 | 28 | 1. 
**Load and Explore the Dataset:** 29 | - Load the `loan_data.csv` dataset into R. 30 | - Explore the dataset to understand its structure and contents. 31 | 32 | 2. **Data Cleaning:** 33 | - Check for and handle any missing values. 34 | - Encode categorical variables as needed for modeling. 35 | 36 | 3. **Data Visualization:** 37 | - Create visualizations to understand the distribution of features and their relationship with loan default. 38 | 39 | 4. **Feature Engineering:** 40 | - Create any new features that might help in predicting loan defaults. 41 | 42 | 5. **Splitting the Data:** 43 | - Split the data into training and testing sets. 44 | 45 | 6. **Building Models:** 46 | - Build at least two different models to predict loan defaults (e.g., Decision Tree, Logistic Regression). 47 | 48 | 7. **Evaluating the Models:** 49 | - Evaluate the performance of the models using appropriate metrics (e.g., accuracy, confusion matrix, ROC curve). 50 | 51 | 8. **Making Predictions:** 52 | - Use the best-performing model to make predictions on new data. 53 | 54 | 9. **Reporting:** 55 | - Provide a comprehensive report on your findings, including visualizations, model evaluations, and any insights gained from the analysis. 56 | 57 | ### Step-by-Step Solution in R 58 | 59 | 1. **Load and Explore the Dataset:** 60 | 61 | ```r 62 | # Load necessary libraries 63 | library(ggplot2) 64 | library(caTools) 65 | library(caret) 66 | library(rpart) 67 | library(rpart.plot) 68 | library(dplyr) 69 | 70 | # Load the dataset 71 | loan_data <- read.csv("loan_data.csv") 72 | 73 | # View the first few rows 74 | head(loan_data) 75 | 76 | # Summary of the dataset 77 | summary(loan_data) 78 | 79 | # Structure of the dataset 80 | str(loan_data) 81 | ``` 82 | 83 | 2. 
**Data Cleaning:** 84 | 85 | ```r 86 | # Check for missing values 87 | sum(is.na(loan_data)) 88 | 89 | # Handle missing values (example: impute with median for numerical and mode for categorical) 90 | loan_data$LoanAmount[is.na(loan_data$LoanAmount)] <- median(loan_data$LoanAmount, na.rm = TRUE) 91 | loan_data$Loan_Amount_Term[is.na(loan_data$Loan_Amount_Term)] <- median(loan_data$Loan_Amount_Term, na.rm = TRUE) 92 | loan_data$Credit_History[is.na(loan_data$Credit_History)] <- as.numeric(names(sort(table(loan_data$Credit_History), decreasing = TRUE)[1])) 93 | 94 | loan_data$Gender[is.na(loan_data$Gender)] <- as.character(names(sort(table(loan_data$Gender), decreasing = TRUE)[1])) 95 | loan_data$Married[is.na(loan_data$Married)] <- as.character(names(sort(table(loan_data$Married), decreasing = TRUE)[1])) 96 | loan_data$Dependents[is.na(loan_data$Dependents)] <- as.character(names(sort(table(loan_data$Dependents), decreasing = TRUE)[1])) 97 | loan_data$Self_Employed[is.na(loan_data$Self_Employed)] <- as.character(names(sort(table(loan_data$Self_Employed), decreasing = TRUE)[1])) 98 | loan_data$CustomerID <- NULL # Drop the row identifier so `Loan_Status ~ .` does not use it as a predictor 99 | # Encode categorical variables 100 | loan_data$Gender <- as.factor(loan_data$Gender) 101 | loan_data$Married <- as.factor(loan_data$Married) 102 | loan_data$Dependents <- as.factor(loan_data$Dependents) 103 | loan_data$Education <- as.factor(loan_data$Education) 104 | loan_data$Self_Employed <- as.factor(loan_data$Self_Employed) 105 | loan_data$Property_Area <- as.factor(loan_data$Property_Area) 106 | loan_data$Loan_Status <- ifelse(loan_data$Loan_Status == "Y", 1, 0) 107 | ``` 108 | 109 | 3. 
**Data Visualization:** 110 | 111 | ```r 112 | # Visualize the distribution of loan status 113 | ggplot(loan_data, aes(x = factor(Loan_Status))) + 114 | geom_bar() + 115 | labs(x = "Loan Status", y = "Count", title = "Distribution of Loan Status") 116 | 117 | # Visualize the relationship between LoanAmount and Loan_Status 118 | ggplot(loan_data, aes(x = LoanAmount, fill = factor(Loan_Status))) + 119 | geom_histogram(binwidth = 10, position = "dodge") + 120 | labs(x = "Loan Amount", y = "Count", title = "Loan Amount vs Loan Status") 121 | ``` 122 | 123 | 4. **Feature Engineering:** 124 | 125 | ```r 126 | # Create TotalIncome feature 127 | loan_data$TotalIncome <- loan_data$ApplicantIncome + loan_data$CoapplicantIncome 128 | 129 | # Create EMI feature 130 | loan_data$EMI <- loan_data$LoanAmount / loan_data$Loan_Amount_Term 131 | 132 | # Log transformation for skewed features 133 | loan_data$LoanAmount <- log1p(loan_data$LoanAmount) 134 | loan_data$TotalIncome <- log1p(loan_data$TotalIncome) 135 | loan_data$EMI <- log1p(loan_data$EMI) 136 | ``` 137 | 138 | 5. **Splitting the Data:** 139 | 140 | ```r 141 | # Split the data 142 | set.seed(123) 143 | split <- sample.split(loan_data$Loan_Status, SplitRatio = 0.7) 144 | training_set <- subset(loan_data, split == TRUE) 145 | testing_set <- subset(loan_data, split == FALSE) 146 | ``` 147 | 148 | 6. **Building Models:** 149 | 150 | ```r 151 | # Decision Tree Model 152 | decision_tree_model <- rpart(Loan_Status ~ ., data = training_set, method = "class") 153 | 154 | # Logistic Regression Model 155 | logistic_regression_model <- glm(Loan_Status ~ ., data = training_set, family = binomial) 156 | ``` 157 | 158 | 7. 
**Evaluating the Models:** 159 | 160 | ```r 161 | # Decision Tree Predictions 162 | dt_predictions <- predict(decision_tree_model, testing_set, type = "class") 163 | 164 | # Confusion Matrix for Decision Tree (explicit levels keep both classes in the table) 165 | confusionMatrix(factor(dt_predictions, levels = c(0, 1)), factor(testing_set$Loan_Status, levels = c(0, 1))) 166 | 167 | # Logistic Regression Predictions 168 | lr_probabilities <- predict(logistic_regression_model, testing_set, type = "response") 169 | lr_predictions <- ifelse(lr_probabilities > 0.5, 1, 0) 170 | 171 | # Confusion Matrix for Logistic Regression (explicit levels, in case one class is never predicted) 172 | confusionMatrix(factor(lr_predictions, levels = c(0, 1)), factor(testing_set$Loan_Status, levels = c(0, 1))) 173 | ``` 174 | 175 | 8. **Making Predictions:** 176 | 177 | ```r 178 | # New customer data 179 | new_customer <- data.frame(Age = 30, Gender = "Male", Married = "Yes", Dependents = "0", 180 | Education = "Graduate", Self_Employed = "No", ApplicantIncome = 5000, 181 | CoapplicantIncome = 0, LoanAmount = log1p(150), Loan_Amount_Term = 360, 182 | Credit_History = 1, Property_Area = "Urban", TotalIncome = log1p(5000), 183 | EMI = log1p(150/360)) 184 | 185 | # Predict using the decision tree model 186 | dt_prediction_new <- predict(decision_tree_model, new_customer, type = "class") 187 | print(dt_prediction_new) 188 | 189 | # Predict using the logistic regression model 190 | lr_probability_new <- predict(logistic_regression_model, new_customer, type = "response") 191 | lr_prediction_new <- ifelse(lr_probability_new > 0.5, 1, 0) 192 | print(lr_prediction_new) 193 | ``` 194 | 195 | This step-by-step case study provides a structured approach to analyzing and predicting EMI loan defaults using R. Feel free to expand on these steps and include more advanced techniques and visualizations as needed. 
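The task list mentions the ROC curve as a possible metric. As a minimal base-R sketch of the related area under the curve (AUC), the example below uses the rank-sum (Mann-Whitney) identity. `auc_from_scores` is a hypothetical helper introduced here for illustration, and the toy labels and scores are made up; they do not come from `loan_data.csv`.

```r
# Hypothetical helper: AUC via the rank-sum (Mann-Whitney U) identity.
# labels: 0/1 vector of true classes; scores: predicted probabilities.
auc_from_scores <- function(labels, scores) {
  stopifnot(length(labels) == length(scores))
  n_pos <- sum(labels == 1)
  n_neg <- sum(labels == 0)
  stopifnot(n_pos > 0, n_neg > 0)  # AUC is undefined with a single class
  r <- rank(scores)                # ties receive average ranks
  (sum(r[labels == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Toy example (illustrative values only)
toy_labels <- c(1, 0, 1, 1, 0, 0)
toy_scores <- c(0.9, 0.4, 0.7, 0.3, 0.2, 0.6)
toy_auc <- auc_from_scores(toy_labels, toy_scores)
print(toy_auc)
```

With the objects from the evaluation step, the call would be `auc_from_scores(testing_set$Loan_Status, lr_probabilities)`. An AUC of 0.5 corresponds to random ranking and 1 to perfect separation.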
196 | -------------------------------------------------------------------------------- /Case Study/Predicting EMI Loan Default_Question.md: -------------------------------------------------------------------------------- 1 | ### Case Study: Predicting EMI Loan Default 2 | 3 | #### Scenario: 4 | 5 | You work for a financial institution that provides loans to customers. The institution wants to develop a model to predict whether a customer will default on their EMI (Equated Monthly Installment) loan. Predicting defaults will help the institution manage risk and make informed lending decisions. You are provided with a dataset containing various features about the customers and their loan status. 6 | 7 | #### Dataset: 8 | 9 | The dataset provided is `loan_data.csv` with the following columns: 10 | 11 | - `CustomerID`: Unique identifier for each customer 12 | - `Age`: Age of the customer 13 | - `Gender`: Gender of the customer (Male, Female) 14 | - `Married`: Whether the customer is married (Yes, No) 15 | - `Dependents`: Number of dependents 16 | - `Education`: Education level (Graduate, Not Graduate) 17 | - `Self_Employed`: Whether the customer is self-employed (Yes, No) 18 | - `ApplicantIncome`: Monthly income of the applicant 19 | - `CoapplicantIncome`: Monthly income of the co-applicant 20 | - `LoanAmount`: Loan amount (in thousands) 21 | - `Loan_Amount_Term`: Term of the loan (in months) 22 | - `Credit_History`: Credit history (1: good, 0: bad) 23 | - `Property_Area`: Area of the property (Urban, Semiurban, Rural) 24 | - `Loan_Status`: Whether the loan was approved (Y, N) 25 | 26 | #### Task: 27 | 28 | 1. **Load and Explore the Dataset:** 29 | - Load the `loan_data.csv` dataset into R. 30 | - Explore the dataset to understand its structure and contents. 31 | 32 | 2. **Data Cleaning:** 33 | - Check for and handle any missing values. 34 | - Encode categorical variables as needed for modeling. 35 | 36 | 3. 
**Data Visualization:** 37 | - Create visualizations to understand the distribution of features and their relationship with loan default. 38 | 39 | 4. **Feature Engineering:** 40 | - Create any new features that might help in predicting loan defaults. 41 | 42 | 5. **Splitting the Data:** 43 | - Split the data into training and testing sets. 44 | 45 | 6. **Building Models:** 46 | - Build at least two different models to predict loan defaults (e.g., Decision Tree, Logistic Regression). 47 | 48 | 7. **Evaluating the Models:** 49 | - Evaluate the performance of the models using appropriate metrics (e.g., accuracy, confusion matrix, ROC curve). 50 | 51 | 8. **Making Predictions:** 52 | - Use the best-performing model to make predictions on new data. 53 | 54 | 9. **Reporting:** 55 | - Provide a comprehensive report on your findings, including visualizations, model evaluations, and any insights gained from the analysis. 56 | 57 | ### Step-by-Step Solution in R 58 | 59 | 1. **Load and Explore the Dataset:** 60 | 61 | ```r 62 | # Load necessary libraries 63 | library(ggplot2) 64 | library(caTools) 65 | library(caret) 66 | library(rpart) 67 | library(rpart.plot) 68 | library(dplyr) 69 | 70 | # Load the dataset 71 | loan_data <- read.csv("loan_data.csv") 72 | 73 | # View the first few rows 74 | head(loan_data) 75 | 76 | # Summary of the dataset 77 | summary(loan_data) 78 | 79 | # Structure of the dataset 80 | str(loan_data) 81 | ``` 82 | 83 | 2. 
**Data Cleaning:** 84 | 85 | ```r 86 | # Check for missing values 87 | sum(is.na(loan_data)) 88 | 89 | # Handle missing values (example: impute with median for numerical and mode for categorical) 90 | loan_data$LoanAmount[is.na(loan_data$LoanAmount)] <- median(loan_data$LoanAmount, na.rm = TRUE) 91 | loan_data$Loan_Amount_Term[is.na(loan_data$Loan_Amount_Term)] <- median(loan_data$Loan_Amount_Term, na.rm = TRUE) 92 | loan_data$Credit_History[is.na(loan_data$Credit_History)] <- as.numeric(names(sort(table(loan_data$Credit_History), decreasing = TRUE)[1])) 93 | 94 | loan_data$Gender[is.na(loan_data$Gender)] <- as.character(names(sort(table(loan_data$Gender), decreasing = TRUE)[1])) 95 | loan_data$Married[is.na(loan_data$Married)] <- as.character(names(sort(table(loan_data$Married), decreasing = TRUE)[1])) 96 | loan_data$Dependents[is.na(loan_data$Dependents)] <- as.character(names(sort(table(loan_data$Dependents), decreasing = TRUE)[1])) 97 | loan_data$Self_Employed[is.na(loan_data$Self_Employed)] <- as.character(names(sort(table(loan_data$Self_Employed), decreasing = TRUE)[1])) 98 | loan_data$CustomerID <- NULL # Drop the row identifier so `Loan_Status ~ .` does not use it as a predictor 99 | # Encode categorical variables 100 | loan_data$Gender <- as.factor(loan_data$Gender) 101 | loan_data$Married <- as.factor(loan_data$Married) 102 | loan_data$Dependents <- as.factor(loan_data$Dependents) 103 | loan_data$Education <- as.factor(loan_data$Education) 104 | loan_data$Self_Employed <- as.factor(loan_data$Self_Employed) 105 | loan_data$Property_Area <- as.factor(loan_data$Property_Area) 106 | loan_data$Loan_Status <- ifelse(loan_data$Loan_Status == "Y", 1, 0) 107 | ``` 108 | 109 | 3. 
**Data Visualization:** 110 | 111 | ```r 112 | # Visualize the distribution of loan status 113 | ggplot(loan_data, aes(x = factor(Loan_Status))) + 114 | geom_bar() + 115 | labs(x = "Loan Status", y = "Count", title = "Distribution of Loan Status") 116 | 117 | # Visualize the relationship between LoanAmount and Loan_Status 118 | ggplot(loan_data, aes(x = LoanAmount, fill = factor(Loan_Status))) + 119 | geom_histogram(binwidth = 10, position = "dodge") + 120 | labs(x = "Loan Amount", y = "Count", title = "Loan Amount vs Loan Status") 121 | ``` 122 | 123 | 4. **Feature Engineering:** 124 | 125 | ```r 126 | # Create TotalIncome feature 127 | loan_data$TotalIncome <- loan_data$ApplicantIncome + loan_data$CoapplicantIncome 128 | 129 | # Create EMI feature 130 | loan_data$EMI <- loan_data$LoanAmount / loan_data$Loan_Amount_Term 131 | 132 | # Log transformation for skewed features 133 | loan_data$LoanAmount <- log1p(loan_data$LoanAmount) 134 | loan_data$TotalIncome <- log1p(loan_data$TotalIncome) 135 | loan_data$EMI <- log1p(loan_data$EMI) 136 | ``` 137 | 138 | 5. **Splitting the Data:** 139 | 140 | ```r 141 | # Split the data 142 | set.seed(123) 143 | split <- sample.split(loan_data$Loan_Status, SplitRatio = 0.7) 144 | training_set <- subset(loan_data, split == TRUE) 145 | testing_set <- subset(loan_data, split == FALSE) 146 | ``` 147 | 148 | 6. **Building Models:** 149 | 150 | ```r 151 | # Decision Tree Model 152 | decision_tree_model <- rpart(Loan_Status ~ ., data = training_set, method = "class") 153 | 154 | # Logistic Regression Model 155 | logistic_regression_model <- glm(Loan_Status ~ ., data = training_set, family = binomial) 156 | ``` 157 | 158 | 7. 
**Evaluating the Models:** 159 | 160 | ```r 161 | # Decision Tree Predictions 162 | dt_predictions <- predict(decision_tree_model, testing_set, type = "class") 163 | 164 | # Confusion Matrix for Decision Tree (explicit levels keep both classes in the table) 165 | confusionMatrix(factor(dt_predictions, levels = c(0, 1)), factor(testing_set$Loan_Status, levels = c(0, 1))) 166 | 167 | # Logistic Regression Predictions 168 | lr_probabilities <- predict(logistic_regression_model, testing_set, type = "response") 169 | lr_predictions <- ifelse(lr_probabilities > 0.5, 1, 0) 170 | 171 | # Confusion Matrix for Logistic Regression (explicit levels, in case one class is never predicted) 172 | confusionMatrix(factor(lr_predictions, levels = c(0, 1)), factor(testing_set$Loan_Status, levels = c(0, 1))) 173 | ``` 174 | 175 | 8. **Making Predictions:** 176 | 177 | ```r 178 | # New customer data 179 | new_customer <- data.frame(Age = 30, Gender = "Male", Married = "Yes", Dependents = "0", 180 | Education = "Graduate", Self_Employed = "No", ApplicantIncome = 5000, 181 | CoapplicantIncome = 0, LoanAmount = log1p(150), Loan_Amount_Term = 360, 182 | Credit_History = 1, Property_Area = "Urban", TotalIncome = log1p(5000), 183 | EMI = log1p(150/360)) 184 | 185 | # Predict using the decision tree model 186 | dt_prediction_new <- predict(decision_tree_model, new_customer, type = "class") 187 | print(dt_prediction_new) 188 | 189 | # Predict using the logistic regression model 190 | lr_probability_new <- predict(logistic_regression_model, new_customer, type = "response") 191 | lr_prediction_new <- ifelse(lr_probability_new > 0.5, 1, 0) 192 | print(lr_prediction_new) 193 | ``` 194 | 195 | This step-by-step case study provides a structured approach to analyzing and predicting EMI loan defaults using R. Feel free to expand on these steps and include more advanced techniques and visualizations as needed. 
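The evaluation step relies on caret's `confusionMatrix()` output. As a rough sketch of what accuracy, precision, and recall mean underneath, the base-R example below derives them from a 2 x 2 table. The `actual` and `predicted` vectors are made-up illustrative values, not results from `loan_data.csv`.

```r
# Illustrative vectors (made up), coded 1 = positive class, 0 = negative class
actual    <- c(1, 0, 1, 1, 0, 1, 0, 0)
predicted <- c(1, 0, 0, 1, 0, 1, 1, 0)

# Fixing the levels keeps the table 2 x 2 even if a class is never predicted
cm <- table(
  Actual    = factor(actual, levels = c(0, 1)),
  Predicted = factor(predicted, levels = c(0, 1))
)

accuracy  <- sum(diag(cm)) / sum(cm)         # correct predictions / all predictions
precision <- cm["1", "1"] / sum(cm[, "1"])   # true positives / predicted positives
recall    <- cm["1", "1"] / sum(cm["1", ])   # true positives / actual positives

print(cm)
print(c(accuracy = accuracy, precision = precision, recall = recall))
```

Which class counts as "positive" is a modeling decision; here it is assumed to be the class coded as 1.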
196 | -------------------------------------------------------------------------------- /Case Study/Predicting Employee Attrition.md: -------------------------------------------------------------------------------- 1 | ### Case Study: Predicting Employee Attrition 2 | 3 | #### Scenario: 4 | 5 | You are a data scientist at a large corporation. The Human Resources (HR) department has tasked you with predicting which employees are likely to leave the company (attrition). Reducing employee attrition is critical for the company to save on hiring and training costs. The HR department has provided you with a dataset containing various features about the employees. 6 | 7 | #### Dataset: 8 | 9 | The dataset provided is `employee_attrition.csv` with the following columns: 10 | 11 | - `Age`: Age of the employee 12 | - `BusinessTravel`: Frequency of travel (Rarely, Frequently, Never) 13 | - `Department`: Department of the employee (Sales, Research & Development, Human Resources) 14 | - `DistanceFromHome`: Distance from home to work (in miles) 15 | - `Education`: Education level (1-5) 16 | - `EnvironmentSatisfaction`: Satisfaction with the work environment (1-4) 17 | - `JobInvolvement`: Job involvement level (1-4) 18 | - `JobLevel`: Job level (1-5) 19 | - `JobSatisfaction`: Job satisfaction level (1-4) 20 | - `MonthlyIncome`: Monthly income 21 | - `NumCompaniesWorked`: Number of companies the employee has worked at 22 | - `OverTime`: Whether the employee works overtime (Yes, No) 23 | - `PercentSalaryHike`: Percent increase in salary over the last year 24 | - `PerformanceRating`: Performance rating (1-4) 25 | - `RelationshipSatisfaction`: Relationship satisfaction level (1-4) 26 | - `TotalWorkingYears`: Total years the employee has worked 27 | - `TrainingTimesLastYear`: Number of training sessions attended last year 28 | - `WorkLifeBalance`: Work-life balance satisfaction level (1-4) 29 | - `YearsAtCompany`: Number of years the employee has been at the company 30 | - 
`YearsInCurrentRole`: Number of years in the current role 31 | - `YearsSinceLastPromotion`: Number of years since the last promotion 32 | - `YearsWithCurrManager`: Number of years with the current manager 33 | - `Attrition`: Whether the employee left the company (Yes, No) 34 | 35 | #### Task: 36 | 37 | 1. **Load and Explore the Dataset:** 38 | - Load the `employee_attrition.csv` dataset into R. 39 | - Explore the dataset to understand its structure and contents. 40 | 41 | 2. **Data Cleaning:** 42 | - Check for and handle any missing values. 43 | - Encode categorical variables as needed for modeling. 44 | 45 | 3. **Data Visualization:** 46 | - Create visualizations to understand the distribution of features and their relationship with attrition. 47 | 48 | 4. **Feature Engineering:** 49 | - Create any new features that might help in predicting attrition. 50 | 51 | 5. **Splitting the Data:** 52 | - Split the data into training and testing sets. 53 | 54 | 6. **Building Models:** 55 | - Build at least two different models to predict employee attrition (e.g., Decision Tree, Logistic Regression). 56 | 57 | 7. **Evaluating the Models:** 58 | - Evaluate the performance of the models using appropriate metrics (e.g., accuracy, confusion matrix, ROC curve). 59 | 60 | 8. **Making Predictions:** 61 | - Use the best-performing model to make predictions on new data. 62 | 63 | 9. **Reporting:** 64 | - Provide a comprehensive report on your findings, including visualizations, model evaluations, and any insights gained from the analysis. 65 | 66 | #### Steps in R: 67 | 68 | Here are some R code snippets to help you get started with each task: 69 | 70 | 1. 
**Load and Explore the Dataset:** 71 | 72 | ```r 73 | # Load necessary libraries 74 | library(ggplot2) 75 | library(caTools) 76 | library(caret) 77 | library(rpart) 78 | library(rpart.plot) 79 | 80 | # Load the dataset 81 | employee_data <- read.csv("employee_attrition.csv") 82 | 83 | # View the first few rows 84 | head(employee_data) 85 | 86 | # Summary of the dataset 87 | summary(employee_data) 88 | 89 | # Structure of the dataset 90 | str(employee_data) 91 | ``` 92 | 93 | 2. **Data Cleaning:** 94 | 95 | ```r 96 | # Check for missing values 97 | sum(is.na(employee_data)) 98 | 99 | # Encode categorical variables 100 | employee_data$Attrition <- ifelse(employee_data$Attrition == "Yes", 1, 0) 101 | employee_data$OverTime <- ifelse(employee_data$OverTime == "Yes", 1, 0) 102 | employee_data$BusinessTravel <- as.factor(employee_data$BusinessTravel) 103 | employee_data$Department <- as.factor(employee_data$Department) 104 | ``` 105 | 106 | 3. **Data Visualization:** 107 | 108 | ```r 109 | # Visualize the distribution of attrition 110 | ggplot(employee_data, aes(x = factor(Attrition))) + 111 | geom_bar() + 112 | labs(x = "Attrition", y = "Count", title = "Distribution of Attrition") 113 | 114 | # Visualize the relationship between MonthlyIncome and Attrition 115 | ggplot(employee_data, aes(x = MonthlyIncome, fill = factor(Attrition))) + 116 | geom_histogram(binwidth = 1000, position = "dodge") + 117 | labs(x = "Monthly Income", y = "Count", title = "Monthly Income vs Attrition") 118 | ``` 119 | 120 | 4. **Splitting the Data:** 121 | 122 | ```r 123 | # Split the data 124 | set.seed(123) 125 | split <- sample.split(employee_data$Attrition, SplitRatio = 0.7) 126 | training_set <- subset(employee_data, split == TRUE) 127 | testing_set <- subset(employee_data, split == FALSE) 128 | ``` 129 | 130 | 5. 
**Building Models:** 131 | 132 | ```r 133 | # Decision Tree Model 134 | decision_tree_model <- rpart(Attrition ~ ., data = training_set, method = "class") 135 | 136 | # Logistic Regression Model 137 | logistic_regression_model <- glm(Attrition ~ ., data = training_set, family = binomial) 138 | ``` 139 | 140 | 6. **Evaluating the Models:** 141 | 142 | ```r 143 | # Decision Tree Predictions 144 | dt_predictions <- predict(decision_tree_model, testing_set, type = "class") 145 | 146 | # Confusion Matrix for Decision Tree (explicit levels keep both classes in the table) 147 | confusionMatrix(factor(dt_predictions, levels = c(0, 1)), factor(testing_set$Attrition, levels = c(0, 1))) 148 | 149 | # Logistic Regression Predictions 150 | lr_probabilities <- predict(logistic_regression_model, testing_set, type = "response") 151 | lr_predictions <- ifelse(lr_probabilities > 0.5, 1, 0) 152 | 153 | # Confusion Matrix for Logistic Regression (explicit levels, in case one class is never predicted) 154 | confusionMatrix(factor(lr_predictions, levels = c(0, 1)), factor(testing_set$Attrition, levels = c(0, 1))) 155 | ``` 156 | 157 | 7. **Making Predictions:** 158 | 159 | ```r 160 | # New employee data (values for factor columns such as BusinessTravel and Department must match levels present in the training data) 161 | new_employee <- data.frame(Age = 35, BusinessTravel = "Travel_Rarely", Department = "Sales", 162 | DistanceFromHome = 10, Education = 3, EnvironmentSatisfaction = 3, 163 | JobInvolvement = 3, JobLevel = 2, JobSatisfaction = 4, 164 | MonthlyIncome = 5000, NumCompaniesWorked = 2, OverTime = 0, 165 | PercentSalaryHike = 15, PerformanceRating = 3, 166 | RelationshipSatisfaction = 3, TotalWorkingYears = 10, 167 | TrainingTimesLastYear = 3, WorkLifeBalance = 3, 168 | YearsAtCompany = 5, YearsInCurrentRole = 2, 169 | YearsSinceLastPromotion = 1, YearsWithCurrManager = 3) 170 | 171 | # Predict using the decision tree model 172 | dt_prediction_new <- predict(decision_tree_model, new_employee, type = "class") 173 | print(dt_prediction_new) 174 | 175 | # Predict using the logistic regression model 176 | lr_probability_new <- predict(logistic_regression_model, new_employee, type = "response") 177 | lr_prediction_new <- ifelse(lr_probability_new > 0.5, 1, 
0) 178 | print(lr_prediction_new) 179 | ``` 180 | -------------------------------------------------------------------------------- /Case Study/mtcars_Case Study Question.md: -------------------------------------------------------------------------------- 1 | ### Case Study Question 2 | 3 | **Objective:** 4 | Predict the miles per gallon (mpg) of cars using various features available in the "mtcars" dataset. 5 | 6 | **Dataset Description:** 7 | The "mtcars" dataset consists of 32 observations on 11 variables. The variables are: 8 | 9 | 1. `mpg`: Miles/(US) gallon 10 | 2. `cyl`: Number of cylinders 11 | 3. `disp`: Displacement (cu.in.) 12 | 4. `hp`: Gross horsepower 13 | 5. `drat`: Rear axle ratio 14 | 6. `wt`: Weight (1000 lbs) 15 | 7. `qsec`: 1/4 mile time (seconds) 16 | 8. `vs`: Engine shape (0 = V-shaped, 1 = straight) 17 | 9. `am`: Transmission (0 = automatic, 1 = manual) 18 | 10. `gear`: Number of forward gears 19 | 11. `carb`: Number of carburetors 20 | 21 | **Steps to Follow:** 22 | 23 | 1. **Loading and Exploring the Dataset:** 24 | - Load the "mtcars" dataset. 25 | - View the first few rows, summary statistics, and structure of the dataset. 26 | 27 | 2. **Data Cleaning:** 28 | - Check for missing values and handle them if necessary. 29 | 30 | 3. **Data Visualization:** 31 | - Create visualizations to explore relationships between `mpg` and other features. 32 | 33 | 4. **Splitting the Data:** 34 | - Split the data into training and testing sets. 35 | 36 | 5. **Building a Model:** 37 | - Build a linear regression model to predict `mpg` using the other variables as predictors. 38 | 39 | 6. **Evaluating the Model:** 40 | - Evaluate the model's performance on the testing set using appropriate metrics (e.g., RMSE). 41 | 42 | 7. **Making Predictions:** 43 | - Use the model to make predictions on new data. 
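Steps 4 to 6 can also be sketched with base R alone. The example below assumes a plain random 70/30 split with `sample()` and, purely for illustration, a two-predictor model (`wt` and `hp`). Both choices are assumptions for this sketch, not requirements of the exercise.

```r
# Load the built-in dataset
data(mtcars)

# Step 4 (sketch): random 70/30 split using base R only
set.seed(123)
n <- nrow(mtcars)
train_idx <- sample(seq_len(n), size = floor(0.7 * n))
training_set <- mtcars[train_idx, ]
testing_set  <- mtcars[-train_idx, ]

# Step 5 (sketch): a small linear model; wt and hp are illustrative choices
model <- lm(mpg ~ wt + hp, data = training_set)

# Step 6 (sketch): error metrics on the held-out rows
predictions <- predict(model, testing_set)
rmse <- sqrt(mean((testing_set$mpg - predictions)^2))  # penalizes large errors more
mae  <- mean(abs(testing_set$mpg - predictions))       # average absolute error

print(c(RMSE = rmse, MAE = mae))
```

Reporting both metrics can be informative: a large gap between RMSE and MAE suggests that a few cars are predicted much worse than the rest.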
### Sample Code

Here is an example code outline to get you started:

```r
# Load the dataset
data(mtcars)

# Step 1: Explore the dataset
head(mtcars)
summary(mtcars)
str(mtcars)

# Step 2: Data Cleaning
# Check for missing values
sum(is.na(mtcars))

# Step 3: Data Visualization
library(ggplot2)
# Scatter plot of mpg vs hp
ggplot(mtcars, aes(x = hp, y = mpg)) + geom_point() + theme_minimal()
# Pair plot
pairs(mtcars)

# Step 4: Splitting the Data
library(caTools)
set.seed(123)
split <- sample.split(mtcars$mpg, SplitRatio = 0.7)
training_set <- subset(mtcars, split == TRUE)
testing_set <- subset(mtcars, split == FALSE)

# Step 5: Building the Model
model <- lm(mpg ~ ., data = training_set)

# Step 6: Evaluating the Model
summary(model)
predictions <- predict(model, newdata = testing_set)
# Calculate RMSE
rmse <- sqrt(mean((testing_set$mpg - predictions)^2))
print(paste("RMSE:", round(rmse, 2)))

# Step 7: Making Predictions
new_data <- data.frame(
  cyl = 4, disp = 160, hp = 110, drat = 3.9, wt = 2.62,
  qsec = 16.46, vs = 0, am = 1, gear = 4, carb = 2
)
predicted_mpg <- predict(model, newdata = new_data)
print(predicted_mpg)
```

### Questions to Answer:

1. What is the relationship between horsepower (`hp`) and miles per gallon (`mpg`)?
2. Which features are most significant in predicting `mpg`?
3. What is the Root Mean Square Error (RMSE) of your model on the testing set?
4. What would be the predicted `mpg` for a car with the following characteristics:
   - 4 cylinders
   - 160 cu.in. displacement
   - 110 horsepower
   - 3.9 rear axle ratio
   - Weight of 2.62 (in 1000 lbs)
   - 16.46 seconds quarter-mile time
   - V-shaped engine (`vs` = 0)
   - Manual transmission (`am` = 1)
   - 4 forward gears
   - 2 carburetors

This case study provides a comprehensive exercise in data analysis, visualization, modeling, and prediction using R.

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# R Programming Tutorial

R is a powerful and widely used programming language and environment designed for statistical computing and data analysis. It was developed by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand, in the early 1990s. R is open-source, meaning it is freely available for anyone to use, modify, and distribute.

Here's an introduction to R programming:

1. **Statistical Computing**: R was created with a primary focus on statistical analysis and data manipulation. It provides an extensive array of statistical and graphical techniques, making it a strong choice for researchers, statisticians, data analysts, and data scientists.

2. **Open Source**: Because the source code is freely available to examine, modify, and distribute, R has a vibrant and active community of users and developers who continuously enhance its capabilities.

3. **Rich Package Ecosystem**: One of the key strengths of R is its vast collection of packages. These packages are extensions that add specialized functions and features to the language. The Comprehensive R Archive Network (CRAN) hosts thousands of packages, covering various domains, from machine learning and data visualization to econometrics and bioinformatics.

4. **Data Manipulation**: R provides powerful tools for data manipulation and cleaning. The `dplyr` package, for instance, simplifies tasks such as filtering, summarizing, and joining data tables, making it easier to work with complex datasets.

5. **Data Visualization**: R is renowned for its data visualization capabilities. The `ggplot2` package, created by Hadley Wickham, is a widely used tool for creating elegant and customized data visualizations, including scatter plots, bar charts, and heatmaps.

6. **Statistical Modeling**: R supports a wide range of statistical modeling techniques, from linear and logistic regression to more advanced methods like decision trees and neural networks. The built-in `stats` package includes functions for many statistical tests and models, while the more advanced methods are available through additional packages.

7. **Scripting Language**: R is an interpreted language, so you can run code from script files or interactively at a command-line console (REPL). This makes it easy to experiment with data and try out code one step at a time.

8. **Cross-Platform**: R is available for multiple platforms, including Windows, macOS, and various Linux distributions, making it accessible to a broad audience.

9. **Community and Support**: R has a large and active community of users and developers. You can find extensive documentation, tutorials, forums, and user-contributed resources online. This community support can be invaluable when you encounter challenges or need help with specific tasks.

10. **Integration**: R can be integrated with other programming languages such as Python and C++, as well as with various data storage and database systems.

In summary, R is a versatile and powerful programming language that excels in statistical computing and data analysis. Whether you are working on data visualization, statistical modeling, or data manipulation, R provides a comprehensive toolkit and a supportive community to help you accomplish your tasks effectively.

--------------------------------------------------------------------------------