#Example
numeric_vector <- c(1,2,3,4,5,6)
character_vector <- c("A","B", "C", "D", "E", "F")
Vectors
Before diving into logical vectors and relational operators, it is crucial to understand what a vector is as it forms the basis for these operations in R.
A vector is a basic data structure that holds a sequence of data elements of the same type. For example, a numeric vector holds only numeric data, and a character vector holds only character data.
1) Creating a vector
You can create a vector in R using the c() function.
Again, we can assign this to a named object:
x <- c(0,1,3,3,8,8,3,6,4,6) #now x is a 10-element vector
To see the contents of x, simply type:
x
[1] 0 1 3 3 8 8 3 6 4 6
The : symbol can be used to create sequences of increasing (or decreasing) values. For example:
numbers5to20 <- 5:20
numbers5to20
[1] 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
Vectors can be joined together (i.e., concatenated) with the c() function. For example, note what happens when we type:
c(numbers5to20, x)
[1] 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 0 1 3 3 8 8 3 6 4
[26] 6
We can append numbers5to20 to the end of x, and then append the decreasing sequence from 4 to 1:
a.mess <- c(x, numbers5to20, 4:1)
a.mess
[1] 0 1 3 3 8 8 3 6 4 6 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
[26] 20 4 3 2 1
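As a quick sanity check on the concatenation above, the following sketch rebuilds a.mess from the values used in the text and confirms that its length equals the sum of the lengths of its pieces. It relies only on the base R functions length() and identical().

```r
# Rebuild the vectors from the text so this block runs on its own
x <- c(0, 1, 3, 3, 8, 8, 3, 6, 4, 6)
numbers5to20 <- 5:20
a.mess <- c(x, numbers5to20, 4:1)

# The length of the result is the sum of the lengths of the pieces
length(a.mess)                                   # 30
length(x) + length(numbers5to20) + length(4:1)   # 30

# The first 10 elements are exactly x, in the original order
identical(a.mess[1:10], x)                       # TRUE
```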
2) Extracting elements from vectors
A nice way to display the 22nd element of a.mess is to use square brackets [ ] to extract just that element:
#Extract 22nd element from a.mess
a.mess[22]
[1] 16
We can extract more than one element at a time. For example, the 3rd, 6th, and 7th elements of a.mess are:
#Extracting 3rd, 6th, and 7th elements
a.mess[c(3,6,7)]
[1] 3 8 3
To get the 3rd through 7th elements of a.mess, just type:
a.mess[3:7]
[1] 3 3 8 8 3
Negative indices can be used to omit certain element(s). For example:
# To omit 3rd element in a.mess
a.mess[-3]
[1] 0 1 3 8 8 3 6 4 6 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
[26] 4 3 2 1
#To omit the 2nd and 10th elements in a.mess
a.mess[-c(2,10)]
[1] 0 3 3 8 8 3 6 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 4
[26] 3 2 1
Do not mix positive and negative indices. To see what happens, observe:
a.mess[c(-2, 10)]
#Error in a.mess[c(-2, 10)] : only 0's may be mixed with negative subscripts
Always make sure that vector indices are integers. When fractional values are used, they are truncated towards 0.
a.mess[0.5]
#numeric(0)
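To make the truncation rule more concrete, here is a small sketch that compares fractional indices with the integer indices they are truncated to. The particular fractional values (2.9 and -1.7) are chosen only for illustration.

```r
# Rebuild a.mess from the text so this block runs on its own
a.mess <- c(c(0, 1, 3, 3, 8, 8, 3, 6, 4, 6), 5:20, 4:1)

# 2.9 is truncated to 2, so this returns the 2nd element
a.mess[2.9]                                # 1
identical(a.mess[2.9], a.mess[2])          # TRUE

# 0.5 is truncated to 0, and index 0 selects nothing
length(a.mess[0.5])                        # 0

# -1.7 is truncated towards 0, giving -1, which omits the 1st element
identical(a.mess[-1.7], a.mess[-1])        # TRUE
```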
3) Vector arithmetic
Arithmetic can be done on R vectors. For example, we can multiply all elements of x by 3.
x*3
[1] 0 3 9 9 24 24 9 18 12 18
Note that the computation is performed element-wise. Addition (+), subtraction (-), and division (/) by a constant have the same kind of effect. For example:
x - 5 #Subtraction
[1] -5 -4 -2 -2 3 3 -2 1 -1 1
x / 2 #Division
[1] 0.0 0.5 1.5 1.5 4.0 4.0 1.5 3.0 2.0 3.0
x + 2 #Addition
[1] 2 3 5 5 10 10 5 8 6 8
Next, consider taking the 3rd power of the elements of x:
x^3 #3rd power of the elements of x
[1] 0 1 27 27 512 512 27 216 64 216
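The same element-wise behaviour applies when both operands are vectors of the same length. The sketch below assumes a second vector y, which is not part of the original text and is introduced only for illustration.

```r
x <- c(0, 1, 3, 3, 8, 8, 3, 6, 4, 6)
y <- 1:10   # hypothetical second vector with the same length as x

# Each element of x is combined with the element of y in the same position
x + y   # 1 3 6 7 13 14 10 14 13 16
x * y   # 0 2 9 12 40 48 21 48 36 60
```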
4) Simple patterned vectors
We have seen the use of the : operator for producing simple sequences of integers. Patterned vectors can also be produced using the seq() and rep() functions:
#The sequence of odd numbers less than or equal to 21
seq(1,21,by=2)
[1] 1 3 5 7 9 11 13 15 17 19 21
Notice the use of by=2 here. The seq() function has several optional parameters. Use ?seq to see the documentation.
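As an illustration of those optional parameters, the following sketch shows two common variations: asking for a fixed number of evenly spaced values with length.out, and counting downwards with a negative by. The specific endpoints are arbitrary choices for the example.

```r
# Five evenly spaced values from 0 to 1
seq(0, 1, length.out = 5)   # 0.00 0.25 0.50 0.75 1.00

# A decreasing sequence: start at 10 and step down by 3 while staying >= 1
seq(10, 1, by = -3)         # 10 7 4 1
```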
rep()
Repeated patterns are obtained using rep(). Consider the following examples:
rep(3, 12) #repeat the value 3, 12 times.
[1] 3 3 3 3 3 3 3 3 3 3 3 3
rep(c(1,4), c(3,2)) #repeat 1 (3 times) and 4 (two times).
[1] 1 1 1 4 4
rep(c(1,4), each = 3) #repeat each value 3 times.
[1] 1 1 1 4 4 4
rep(1:10, rep(2,10)) #repeat each value twice in a row.
[1] 1 1 2 2 3 3 4 4 5 5 6 6 7 7 8 8 9 9 10 10
rep(1:10, times = 2) # repeat 1 to 10 two times.
[1] 1 2 3 4 5 6 7 8 9 10 1 2 3 4 5 6 7 8 9 10
If anything is unclear, you can always consult the help page with ?rep.
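The each and times arguments of rep() can also be combined, and length.out can cap the length of the result. The sketch below is one possible illustration; the input values are arbitrary.

```r
# Each value is repeated twice, then the whole pattern is repeated twice
rep(1:3, each = 2, times = 2)      # 1 1 2 2 3 3 1 1 2 2 3 3

# Keep cycling through the values until the result has 5 elements
rep(c("a", "b"), length.out = 5)   # "a" "b" "a" "b" "a"
```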
5) Vectors with random patterns
The sample() function allows us to simulate things like the results of repeatedly tossing a 6-sided die.
sample(1:6, size = 8, replace = TRUE) #This is an imaginary die tossed 8 times with replacement.
[1] 2 5 3 3 2 2 2 2
sample(1:6, size = 4, replace = FALSE) #This draws 4 values from 1 to 6 without replacement, so no value repeats
[1] 3 1 6 2
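Because sample() is random, your results will usually differ from the ones printed above. If you need the same draws every time, one common approach is to call set.seed() first. The seed value 123 below is an arbitrary choice, and no particular output is assumed.

```r
# Setting the same seed before each call reproduces the same draws
set.seed(123)   # arbitrary seed chosen for illustration
first_run <- sample(1:6, size = 8, replace = TRUE)

set.seed(123)
second_run <- sample(1:6, size = 8, replace = TRUE)

identical(first_run, second_run)   # TRUE

# Without replacement, no value can appear twice
draw <- sample(1:6, size = 4, replace = FALSE)
anyDuplicated(draw)                # 0, meaning there are no duplicates
```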
6) Character vectors
Scalars and vectors can be made up of strings of characters instead of numbers. All elements of a vector must be of the same type. For example:
colors <- c("red", "yellow", "blue")
more.colors <- c(colors, "green", "magenta", "pink") #This appends some new elements to colors
#An attempt to mix data types in a vector
new <- c("green", "yellow", 1)
To see the contents of more.colors and new, simply type:
more.colors
[1] "red" "yellow" "blue" "green" "magenta" "pink"
new
[1] "green" "yellow" "1"
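Notice that the number 1 was silently converted to the string "1". The sketch below checks this with class(), and adds a second, hypothetical vector named mixed (not in the original text) to show the same kind of conversion between logical and numeric values.

```r
new <- c("green", "yellow", 1)
class(new)     # "character": the number 1 became the string "1"

# Hypothetical example mixing a number and a logical value
mixed <- c(1.5, TRUE)
class(mixed)   # "numeric": TRUE was converted to 1
mixed          # 1.5 1.0
```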
Selecting substrings
There are two basic operations you might want to perform on character vectors. To take substrings, use substr(). It takes arguments substr(x, start, stop), where x is a vector of character strings, and start and stop say which characters to keep. For example, to print the first two letters of each color use:
substr(colors, 1, 2)
[1] "re" "ye" "bl"
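The start and stop arguments do not have to begin at 1. As a sketch, the following uses nchar() (which counts the characters in each string) together with substr() to take characters from the middle or the end of each color.

```r
colors <- c("red", "yellow", "blue")

# Characters 2 to 3 of each string
substr(colors, 2, 3)                               # "ed" "el" "lu"

# Number of characters in each string
nchar(colors)                                      # 3 6 4

# The last two characters of each string
substr(colors, nchar(colors) - 1, nchar(colors))   # "ed" "ow" "ue"
```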
paste()
Another basic operation is building up strings by concatenation. Use the paste() function for this. For example:
# Adding the word "flowers" after each color
paste(colors, "flowers")
[1] "red flowers" "yellow flowers" "blue flowers"
There are two optional parameters to paste(). The sep parameter controls what goes between the components being pasted together. We might not want the default space. For example:
#Adding "Several" in front of each color and "s" after it, with no separator
paste("Several", colors, "s", sep = "")
[1] "Severalreds" "Severalyellows" "Severalblues"
paste0()
The paste0() function is a shorthand way to set sep="":
# No need to specify sep
paste0("Several", colors, "s")
[1] "Severalreds" "Severalyellows" "Severalblues"
The collapse parameter to paste() allows all the components of the resulting vector to be collapsed into a single string:
paste("I Like", colors, collapse = ",")
[1] "I Like red,I Like yellow,I Like blue"
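The sep and collapse parameters can be used together: sep goes between the pieces of each element, and collapse goes between the elements when they are joined into one string. The separators below are arbitrary choices for illustration.

```r
colors <- c("red", "yellow", "blue")

# Join the colors into a single comma-separated string
paste(colors, collapse = ", ")
# "red, yellow, blue"

# sep is applied within each element, collapse between elements
paste("I like", colors, sep = ": ", collapse = "; ")
# "I like: red; I like: yellow; I like: blue"
```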
Factor vector
Factors offer an alternative way to store character data. For example, a factor with 4 elements and the 2 levels control and treatment can be created using:
grp <- c("control", "treatment", "control", "treatment")
grp
[1] "control" "treatment" "control" "treatment"
#set as factor
grp <- as.factor(grp)
grp
[1] control treatment control treatment
Levels: control treatment
Factors can be an efficient way of storing character data when values are repeated among the vector elements. This is because the levels of a factor are internally coded as integers. To see what the codes are for our factor, we can type:
as.integer(grp)
[1] 1 2 1 2
The labels for the levels are stored just once each, rather than being repeated. The codes are indices of the vector of levels:
levels(grp)
[1] "control" "treatment"
The levels() function can be used to change factor labels as well. For example, suppose we wish to change the “control” label to “placebo”.
levels(grp)[1] <- "placebo"
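To see the effect of this relabelling, the sketch below rebuilds grp, changes the first level, and then inspects the result. Only the label changes; the underlying integer codes stay the same.

```r
# Rebuild the factor from the text so this block runs on its own
grp <- as.factor(c("control", "treatment", "control", "treatment"))
levels(grp)[1] <- "placebo"

levels(grp)         # "placebo" "treatment"
as.character(grp)   # "placebo" "treatment" "placebo" "treatment"
as.integer(grp)     # 1 2 1 2 (the codes are unchanged)
```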
An important use for factors is to list all possible values, even if some are not present. For example:
gender <- factor(c("Female", "Female", "Female"), levels = c("Female", "Male"))
gender
[1] Female Female Female
Levels: Female Male
It shows that there are two possible values for gender, but only one is present in our vector.
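One place where this matters is when counting values. As a sketch, table() reports a count for every level of a factor, including levels that do not occur in the data.

```r
gender <- factor(c("Female", "Female", "Female"), levels = c("Female", "Male"))

# Counts are reported for every level, even the unused one
table(gender)     # Female: 3, Male: 0

# Number of possible values (levels)
nlevels(gender)   # 2
```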
Logical Vectors
A logical vector is a vector that contains TRUE and FALSE values, which represent the results of logical operations. In R, you can create a logical vector by combining different conditions using logical operators (&, |, !).
#Example
logical_vector <- c(TRUE, FALSE, TRUE, TRUE)
You can perform various operations on logical vectors, such as summing (counting TRUE values) or finding which elements are TRUE or FALSE.
#Example
sum(logical_vector)
[1] 3
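Beyond sum(), a few other base R functions are handy with logical vectors. The sketch below uses which() to find the positions of TRUE (or, with !, of FALSE) values, along with any(), all(), and mean().

```r
logical_vector <- c(TRUE, FALSE, TRUE, TRUE)

which(logical_vector)    # 1 3 4: positions of the TRUE values
which(!logical_vector)   # 2: positions of the FALSE values

any(logical_vector)      # TRUE: at least one element is TRUE
all(logical_vector)      # FALSE: not every element is TRUE
mean(logical_vector)     # 0.75: the proportion of TRUE values
```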
Relational Operators
Relational operators are used to compare values and return a logical vector of TRUE or FALSE.
Types of Relational Operators
• > Greater than
• < Less than
• >= Greater than or equal to
• <= Less than or equal to
• == Equal to
• != Not equal to
Using Relational Operators
You can use relational operators to compare vectors, and the comparison is performed element-wise.
# Example
vector1 <- c(1, 2, 3)
vector2 <- c(3, 2, 1)
comparison_result <- vector1 > vector2 # Output: FALSE FALSE TRUE
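The other relational operators work in the same element-wise way, and a vector can also be compared against a single value. The following sketch reuses vector1 and vector2 from the example above.

```r
vector1 <- c(1, 2, 3)
vector2 <- c(3, 2, 1)

# Element-wise equality between the two vectors
vector1 == vector2        # FALSE TRUE FALSE

# Comparing every element against a single value
vector1 >= 2              # FALSE TRUE TRUE

# Counting the positions where the two vectors differ
sum(vector1 != vector2)   # 2
```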
1) Example - Data Filtering
Use logical vectors and relational operators to filter data based on certain conditions.
# Example
data <- c(5, 8, 2, 6, 1)
filtered_data <- data[data > 4] # Output: 5 8 6
2) Example - Counting Specific Conditions
Count the number of elements in a dataset that meet a specific condition.
# Example
count <- sum(data > 4) # Output: 3
3) Example using Equality (==)
data <- c(5, 8, 2, 6, 1)
filtered_data <- data[data == 5] # Returns 5
4) Example using Inequality (!=)
filtered_data <- data[data != 5] # Returns 8, 2, 6, 1
5) Example using Greater than (>)
filtered_data <- data[data > 4] # Returns 5, 8, 6
6) Example using Less than (<)
filtered_data <- data[data < 3] # Returns 2, 1
7) Example using Less than or Equal to (<=)
filtered_data <- data[data <= 2] # Returns 2, 1
8) Example using AND (&)
filtered_data <- data[data > 2 & data < 7] # Returns 5, 6
9) Example using OR (|)
filtered_data <- data[data < 3 | data > 7] # Returns 8, 2, 1
10) Example using %in% for multiple values
filtered_data <- data[data %in% c(2, 5)] # Returns 5, 2
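The NOT operator (!) was listed among the logical operators earlier but does not appear in the examples above. As a sketch, it can be used to invert a condition, and which() can return the positions of matching elements rather than their values.

```r
data <- c(5, 8, 2, 6, 1)

# Keep the elements that are NOT greater than 4
data[!(data > 4)]            # 2 1

# Keep the elements that are NOT in the set {2, 5}
data[!(data %in% c(2, 5))]   # 8 6 1

# Positions (not values) of the elements greater than 4
which(data > 4)              # 1 2 4
```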
Let’s try with the mtcars dataset
Using the mtcars dataset in R:
• Filter the rows where the number of cylinders (cyl) is 6.
• From the filtered data, extract only the cars that have a miles-per-gallon (mpg) value greater than 20.
• Return the names of these cars.
# Load the mtcars dataset
data(mtcars)
# Filter rows based on the conditions
filtered_data <- mtcars[mtcars$cyl == 6 & mtcars$mpg > 20, ]
# Get the names of the cars
car_names <- rownames(filtered_data)
print(car_names)
[1] "Mazda RX4" "Mazda RX4 Wag" "Hornet 4 Drive"
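The same result can be reached in other ways. The sketch below shows two alternatives using base R: the subset() function, and indexing the row names directly with the logical condition. The variable names cars_subset and names_direct are introduced here only for illustration.

```r
# Alternative 1: subset() lets us refer to columns without the mtcars$ prefix
cars_subset <- subset(mtcars, cyl == 6 & mpg > 20)
rownames(cars_subset)

# Alternative 2: index the row names directly with the logical vector
names_direct <- rownames(mtcars)[mtcars$cyl == 6 & mtcars$mpg > 20]
names_direct

# Both approaches select the same cars
identical(rownames(cars_subset), names_direct)   # TRUE
```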
Exercise!
Create a numeric variable “num_var” with the value “42.5”.
Create a character variable “char_var” with the value “R is fun!”.
Print the data type of “char_var”.
Create a list “student_info” with the following elements:
• “name”: “Mohammad Nasir Abdullah”
• “age”: 18
• “grades”: a numeric vector with values “99, 100, 89”.
Create a data frame “df_students” with the following columns:
• “Name”: “John”, “Pablo”
• “Age”: “22”, “30”
• “Grade”: “A”, “C”
Create a numeric vector “vec_num” with the values “5,10,15,20”.
Extract the second and third elements from “vec_num”.
Create a vector “vec_seq” that contains a sequence of numbers from 1 to 10.
Create a vector “vec_rand” with 5 random numbers between 1 and 100.
Create a character vector “vec_char” with the values “apple”, “banana”, “cherry”.
Use the substr() function to extract the first 3 characters from each element of “vec_char”.
Create a character vector “vec_colors” with the values “red”, “blue”, “green”, “red”, “blue”.
Convert “vec_colors” into a factor vector “factor_colors”.
Print the levels of “factor_colors”.
Given a numeric vector “vec_bonus = c(10,20,30,40,50)”, write a code snippet to:
• Extract all elements greater than 25.
• Calculate the mean of the extracted elements.
Given the following vectors:
weights <- c(55, 68, 72, 61, 58, 75, 64, 70)
names <- c("Alice", "Bob", "Charlie", "Daisy", "Eve", "Frank", "Grace", "Henry")
a. Identify the names of individuals whose weight is greater than 65kg.
b. Find the average weight of individuals weighing less than or equal to 60kg.
c. Extract the weights of individuals whose names start with the letter ‘A’ or ‘D’.
d. Determine the number of individuals whose weight is between 60kg and 70kg (inclusive).
Hint: You might find functions like mean(), substr(), and length() useful.
Given the following vector of daily temperatures (in Celsius) for a week:
temperatures <- c(22, 25, 19, 21, 18, 24, 23)
days <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday")
a. Identify the days when the temperature was below 20°C.
b. Calculate the difference between the highest and lowest temperatures of the week.
c. Determine the number of days when the temperature was between 20°C and 24°C (inclusive).
Given the following vector of exam scores:
scores <- c(85, 78, 92, 65, 88, 70, 95, 80, 60, 90)
students <- c("Anna", "Ben", "Cara", "Dan", "Ella", "Finn", "Grace", "Hank", "Ivy", "Jack")
a. Identify the students who scored above 90.
b. Find the average score of students who scored below 80.
c. Extract the scores of students whose names start with a vowel (A, E, I, O, U).
Take Home Exercise
Basic Vector Creation: Create a numeric vector containing the numbers from 10 to 20 and a character vector containing the days of the week. Display both vectors.
Vector Concatenation: Given two vectors a <- c(1, 2, 3) and b <- c(4, 5, 6), concatenate them to form a single vector.
Element Extraction: Given the vector temperatures <- c(15, 18, 20, 22, 19, 17, 16), extract the temperatures for the 2nd, 4th, and 6th days.
Omitting Elements: From the same temperatures vector, omit the temperature of the 5th day.
Vector Arithmetic: Given the vector prices <- c(10, 20, 30, 40, 50), calculate the new prices after a 10% discount.
Patterned Vectors: Create a vector that contains the numbers from 5 to 50, but only includes every 5th number (i.e., 5, 10, 15, ...).
Random Patterns: Simulate the results of rolling a 6-sided die 10 times.
Character Vector Operations: Given the vector fruits <- c("apple", "banana", "cherry"), extract the first three letters of each fruit.
Logical Vector Filtering: Given the vector ages <- c(25, 30, 35, 40, 45, 50), identify which ages are greater than 30 and less than 50.
Relational Operators with Vectors: Given two vectors vector1 <- c(5, 10, 15) and vector2 <- c(10, 10, 20), determine which elements of vector1 are less than or equal to the corresponding elements in vector2.