Vectors

Author

Dr. Mohammad Nasir Abdullah

Vectors

Before diving into logical vectors and relational operators, it is crucial to understand what a vector is as it forms the basis for these operations in R.

A vector is a basic data structure that holds a sequence of data elements of the same type. For example, a numeric vector holds only numeric data, and a character vector holds only character data.

1) Creating a vector

You can create a vector in R using the c() function.

#Example   
numeric_vector <- c(1,2,3,4,5,6)  
character_vector <- c("A","B", "C", "D", "E", "F")

As before, we can assign a vector to a named object:

x <- c(0,1,3,3,8,8,3,6,4,6) #now x is a 10-element vector

To see the contents of x, simply type:

x
 [1] 0 1 3 3 8 8 3 6 4 6

The : symbol can be used to create sequences of increasing (or decreasing) values. For example:

numbers5to20 <- 5:20  
numbers5to20
 [1]  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
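The example above only shows an increasing sequence. As a small sketch of the decreasing case, the lines below put the larger number first; the object name numbers20to15 is made up for this illustration.

```r
# Putting the larger value first gives a decreasing sequence
numbers20to15 <- 20:15
numbers20to15            # 20 19 18 17 16 15

# length() reports how many elements a vector holds
length(numbers20to15)    # 6
length(5:20)             # 16
```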

Vectors can be joined together (i.e., concatenated) with the c() function. For example, note what happens when we type:

c(numbers5to20, x)
 [1]  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20  0  1  3  3  8  8  3  6  4
[26]  6

We can append numbers5to20 to the end of x, and then append the decreasing sequence from 4 to 1:

a.mess <- c(x, numbers5to20, 4:1)  
a.mess
 [1]  0  1  3  3  8  8  3  6  4  6  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19
[26] 20  4  3  2  1

2) Extracting elements from vectors

A nice way to display the 22nd element of a.mess is to use square brackets [ ] to extract just that element:

#Extract 22nd element from a.mess  
a.mess[22]
[1] 16

We can extract more than one element at a time. For example, the 3rd, 6th, and 7th elements of a.mess are:

#Extracting 3rd, 6th, and 7th elements  
a.mess[c(3,6,7)]
[1] 3 8 3

To get the 3rd through 7th elements of a.mess, just type:

a.mess[3:7]
[1] 3 3 8 8 3

Negative indices can be used to omit certain element(s). For example:

# To omit 3rd element in a.mess  
a.mess[-3]   
 [1]  0  1  3  8  8  3  6  4  6  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
[26]  4  3  2  1
#To omit the 2nd and 10th elements in a.mess  
a.mess[-c(2,10)]
 [1]  0  3  3  8  8  3  6  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20  4
[26]  3  2  1

Do not mix positive and negative indices. To see what happens, observe:

a.mess[c(-2, 10)]    

#Error in a.mess[c(-2, 10)] : only 0's may be mixed with negative subscripts 

Always be careful to make sure that vector indices are integers. When fractional values are used, they will be truncated towards 0.

a.mess[0.5]   
#numeric(0)
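As a small sketch of the truncation rule, the lines below rebuild a.mess as defined earlier and compare a fractional index with its truncated counterpart. The index value 2.7 is chosen only for illustration.

```r
# Rebuild the vectors from the earlier examples so the block runs on its own
x <- c(0, 1, 3, 3, 8, 8, 3, 6, 4, 6)
a.mess <- c(x, 5:20, 4:1)

# 2.7 is truncated towards 0, so it behaves like index 2
frac_result <- a.mess[2.7]
int_result  <- a.mess[2]
identical(frac_result, int_result)   # TRUE

# 0.5 is truncated to 0, and index 0 selects nothing
length(a.mess[0.5])                  # 0
```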

3) Vector arithmetic

Arithmetic can be done on R vectors. For example, we can multiply all elements of x by 3.

x*3
 [1]  0  3  9  9 24 24  9 18 12 18

Note that the computation is performed element-wise. Addition (+), subtraction (-), and division (/) by a constant have the same kind of effect. For example:

x - 5 #Subtraction   
 [1] -5 -4 -2 -2  3  3 -2  1 -1  1
x / 2  #Division  
 [1] 0.0 0.5 1.5 1.5 4.0 4.0 1.5 3.0 2.0 3.0
x + 2 #Addition
 [1]  2  3  5  5 10 10  5  8  6  8

Next, consider taking the 3rd power of the elements of x:

x^3 #3rd power of the elements of x
 [1]   0   1  27  27 512 512  27 216  64 216
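The same element-wise rule applies when both operands are vectors. The sketch below assumes a second vector y, which is not part of the examples above and is introduced only for illustration. It also shows what R calls recycling: when one vector is shorter, its values are reused in order.

```r
x <- c(0, 1, 3, 3, 8, 8, 3, 6, 4, 6)
y <- 1:10                  # hypothetical second vector, same length as x

sum_xy  <- x + y           # adds the 1st elements, the 2nd elements, and so on
prod_xy <- x * y           # multiplies matching elements

sum_xy                     # 1 3 6 7 13 14 10 14 13 16
prod_xy                    # 0 2 9 12 40 48 21 48 36 60

# Recycling: the shorter vector c(10, 20) is reused to match the longer one
recycled <- c(1, 2, 3, 4) + c(10, 20)
recycled                   # 11 22 13 24
```

Multiplying by a constant, as in x*3, is the simplest case of recycling: the single value 3 is reused for every element of x.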

4) Simple patterned vectors

We have seen the use of the : operator for producing simple sequences of integers. Patterned vectors can also be produced using the seq() and rep() functions:

#The sequence of odd numbers less than or equal to 21  

seq(1,21,by=2)
 [1]  1  3  5  7  9 11 13 15 17 19 21

Notice the use of by=2 here. The seq() function has several optional parameters. Use ?seq to see the documentation.
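As one illustration of those optional parameters, the sketch below uses length.out, which asks for a given number of equally spaced values, and a negative by for a decreasing sequence. The object names are made up for this example.

```r
# Five equally spaced values from 0 to 1; seq() works out the step itself
even_spacing <- seq(0, 1, length.out = 5)
even_spacing     # 0.00 0.25 0.50 0.75 1.00

# A decreasing sequence needs a negative step
even_countdown <- seq(10, 0, by = -2)
even_countdown   # 10 8 6 4 2 0
```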

rep()

Repeated patterns are obtained using rep(). Consider the following examples:

rep(3, 12) #repeat the value 3, 12 times.   
 [1] 3 3 3 3 3 3 3 3 3 3 3 3
rep(c(1,4), c(3,2)) #repeat 1 (3 times) and 4 (two times).   
[1] 1 1 1 4 4
rep(c(1,4), each = 3) #repeat each value 3 times.   
[1] 1 1 1 4 4 4
rep(1:10, rep(2,10)) #repeat each value twice in a row.   
 [1]  1  1  2  2  3  3  4  4  5  5  6  6  7  7  8  8  9  9 10 10
rep(1:10, rep(2)) # rep(2) is just 2, so the whole sequence 1 to 10 is repeated two times.
 [1]  1  2  3  4  5  6  7  8  9 10  1  2  3  4  5  6  7  8  9 10

If anything is unclear, you can always refer to the help documentation by typing ?rep.
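The difference between repeating the whole vector and repeating each value is easier to see when the arguments are named. The following sketch uses the times and each arguments of rep(); the object names are chosen only for this example.

```r
by_times <- rep(c(1, 4), times = 3)            # whole vector repeated 3 times
by_each  <- rep(c(1, 4), each = 3)             # each value repeated 3 times
both     <- rep(c(1, 4), each = 2, times = 2)  # each value twice, then all of that twice

by_times   # 1 4 1 4 1 4
by_each    # 1 1 1 4 4 4
both       # 1 1 4 4 1 1 4 4
```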

5) Vector with random patterns

The sample() function allows us to simulate things like the results of the repeated tossing of a 6-sided die.

sample(1:6, size = 8, replace = TRUE) #This is an imaginary die tossed 8 times with replacement.     
[1] 2 5 3 3 2 2 2 2
sample(1:6, size = 4, replace = FALSE) #This is an imaginary die tossed 4 times without replacement
[1] 3 1 6 2
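Because sample() draws at random, the numbers you get will usually differ from those shown above. A minimal sketch of how to make a draw repeatable with set.seed() follows; the seed value 123 is arbitrary.

```r
set.seed(123)    # fix the random number generator; 123 is an arbitrary choice
first_run <- sample(1:6, size = 8, replace = TRUE)

set.seed(123)    # resetting the same seed reproduces the same draw
second_run <- sample(1:6, size = 8, replace = TRUE)

identical(first_run, second_run)   # TRUE
```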

6) Character vectors

Scalars and vectors can be made up of strings of characters instead of numbers. All elements of a vector must be of the same type. For example:

colors <- c("red", "yellow", "blue")   

more.colors <- c(colors, "green", "magenta", "pink") #This appended some new elements to colors
#An attempt to mix data types in a vector   

new <- c("green", "yellow", 1)

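To see the contents of more.colors and new, simply type their names. Note that the number 1 in new has been converted to the character string "1", because all elements must share one type: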

more.colors    
[1] "red"     "yellow"  "blue"    "green"   "magenta" "pink"   
new
[1] "green"  "yellow" "1"     
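To check what type a vector ended up with, class() can be used. The sketch below also shows one way to turn text that looks like numbers back into numbers with as.numeric(); the vector nums_as_text is hypothetical and introduced only for this illustration.

```r
new <- c("green", "yellow", 1)
class(new)                     # "character": the 1 was converted to "1"

# Text that contains digits can be converted back to numbers
nums_as_text <- c("10", "20", "30")
converted <- as.numeric(nums_as_text)
converted + 1                  # 11 21 31
```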

Selecting substrings

There are two basic operations you might want to perform on character vectors. To take substrings, use substr(). It takes arguments substr(x, start, stop), where x is a vector of character strings, and start and stop say which characters to keep. For example, to print the first two letters of each color use:

substr(colors, 1, 2)
[1] "re" "ye" "bl"
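The start and stop arguments can themselves be vectors. As a sketch, the lines below combine substr() with nchar(), which counts the characters in each string, to pick out the last letter of each color. The object names n_chars and last_letters are made up for this example.

```r
colors <- c("red", "yellow", "blue")

n_chars <- nchar(colors)                           # number of characters in each string
n_chars                                            # 3 6 4

# Use each string's own length as both start and stop
last_letters <- substr(colors, n_chars, n_chars)
last_letters                                       # "d" "w" "e"
```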

paste()

Another basic operation is building up strings by concatenation. Use the paste() function for this. For example:

# Adding word flowers after the color characters   

paste(colors, "flowers")
[1] "red flowers"    "yellow flowers" "blue flowers"  

There are two optional parameters to paste(). The sep parameter controls what goes between the components being pasted together. We might not want the default space. For example:

#Adding "Several" in front of each color and "s" after it, with no space in between   
paste("Several", colors, "s", sep = "")
[1] "Severalreds"    "Severalyellows" "Severalblues"  

paste0()

The paste0() function is a shorthand way to set sep="":

# No need to specify sep    

paste0("Several", colors, "s")
[1] "Severalreds"    "Severalyellows" "Severalblues"  

The collapse parameter to paste() allows all the components of the resulting vector to be collapsed into a single string:

paste("I Like", colors, collapse = ",")
[1] "I Like red,I Like yellow,I Like blue"
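A common use of collapse is to join the elements first and then wrap the result in a sentence. The following is a sketch of that pattern; the wording of the sentence is only an example.

```r
colors <- c("red", "yellow", "blue")

# Join the colors into one string, separated by a comma and a space
color_list <- paste(colors, collapse = ", ")
color_list     # "red, yellow, blue"

# Then build a single sentence around it
sentence <- paste0("I like ", color_list, ".")
sentence       # "I like red, yellow, blue."
```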

Factor vector

Factors offer an alternative way to store character data. For example, a factor with 4 elements and 2 levels, control and treatment, can be created using:

grp <- c("control", "treatment", "control", "treatment")   

grp  
[1] "control"   "treatment" "control"   "treatment"
#set as factor 

grp <- as.factor(grp) 

grp
[1] control   treatment control   treatment
Levels: control treatment

Factors can be an efficient way of storing character data when there are repeats among the vector elements. This is because the levels of a factor are internally coded as integers. To see what the codes are for our factor, we can type:

as.integer(grp)
[1] 1 2 1 2

The labels for the levels are stored just once each, rather than being repeated. The codes are indices of the vector of levels:

levels(grp)
[1] "control"   "treatment"

The levels() function can be used to change factor labels as well. For example, suppose we wish to change the “control” label to “placebo”.

levels(grp)[1] <- "placebo"
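To confirm the effect of the relabelling, the sketch below rebuilds grp so that it runs on its own, renames the first level, and inspects the result. Only the label changes; the underlying integer codes stay the same.

```r
grp <- as.factor(c("control", "treatment", "control", "treatment"))

levels(grp)[1] <- "placebo"   # rename the first level

as.character(grp)             # "placebo" "treatment" "placebo" "treatment"
as.integer(grp)               # 1 2 1 2
levels(grp)                   # "placebo" "treatment"
```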

An important use for factors is to list all possible values, even if some are not present. For example:

gender <- factor(c("Female", "Female", "Female"), levels = c("Female", "Male"))   

gender
[1] Female Female Female
Levels: Female Male

It shows that there are two possible values for gender, but only one is present in our vector.
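One place where this matters is counting. As a sketch, table() applied to the factor reports every level, including the one with no observations; the object name counts is chosen only for this example.

```r
gender <- factor(c("Female", "Female", "Female"), levels = c("Female", "Male"))

counts <- table(gender)   # one count per level, present or not
names(counts)             # "Female" "Male"
as.vector(counts)         # 3 0
```

With a plain character vector, table() would list only "Female", since it has no way of knowing that "Male" is a possible value.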

Logical Vectors

A logical vector is a vector that contains TRUE and FALSE values, which represent the results of logical operations. In R, you can create a logical vector by combining different conditions using logical operators (&, |, !).

#Example
logical_vector <- c(TRUE, FALSE, TRUE, TRUE)

You can perform various operations on logical vectors, such as summing (counting TRUE values) or finding which elements are TRUE or FALSE.

#Example 
sum(logical_vector)
[1] 3
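Beyond sum(), a few other base R functions are often used with logical vectors. The sketch below shows which(), any(), all(), and mean() applied to the same logical_vector.

```r
logical_vector <- c(TRUE, FALSE, TRUE, TRUE)

which(logical_vector)     # positions of the TRUE values: 1 3 4
which(!logical_vector)    # positions of the FALSE values: 2

any(logical_vector)       # TRUE, at least one element is TRUE
all(logical_vector)       # FALSE, not every element is TRUE
mean(logical_vector)      # 0.75, the proportion of TRUE values
```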

Relational Operators

Relational operators are used to compare values and return a logical vector of TRUE or FALSE.

Types of Relational Operators

• > Greater than
• < Less than
• >= Greater than or equal to
• <= Less than or equal to
• == Equal to
• != Not equal to

Using Relational Operators

You can use relational operators to compare vectors, and the comparison is performed element-wise.

# Example 
vector1 <- c(1, 2, 3) 
vector2 <- c(3, 2, 1) 
comparison_result <- vector1 > vector2 # Output: FALSE FALSE TRUE
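To see the element-wise behaviour for several operators at once, the sketch below compares the same two vectors pair by pair, and also compares a vector against a single value, which is then applied to every element.

```r
vector1 <- c(1, 2, 3)
vector2 <- c(3, 2, 1)

# Pairs compared: (1, 3), (2, 2), (3, 1)
greater   <- vector1 > vector2     # FALSE FALSE TRUE
equal     <- vector1 == vector2    # FALSE TRUE FALSE

# A single value is compared against every element
at_least2 <- vector1 >= 2          # FALSE TRUE TRUE
```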

1) Example - Data Filtering

Use logical vectors and relational operators to filter data based on certain conditions.

# Example 
data <- c(5, 8, 2, 6, 1) 

filtered_data <- data[data > 4] # Output: 5 8 6

2) Example - Counting Specific Conditions

Count the number of elements in a dataset that meet a specific condition.

# Example 
count <- sum(data > 4) # Output: 3

3) Example by using Equality (==)

data <- c(5, 8, 2, 6, 1)
filtered_data <- data[data == 5]  # Returns 5

4) Example by using Inequality (!=)

filtered_data <- data[data != 5]  # Returns 8, 2, 6, 1

5) Example by using Greater than (>)

filtered_data <- data[data > 4]  # Returns 5, 8, 6

6) Example by using Less than (<)

filtered_data <- data[data < 3]  # Returns 2, 1

7) Example by using Less than or Equal to (<=)

filtered_data <- data[data <= 2]  # Returns 2, 1

8) Example using AND (&)

filtered_data <- data[data > 2 & data < 7]  # Returns 5, 6

9) Example using OR ( | )

filtered_data <- data[data < 3 | data > 7]  # Returns 8, 2, 1

10) Example using %in% for multiple values

filtered_data <- data[data %in% c(2, 5)]  # Returns 5, 2
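The NOT operator (!) mentioned earlier has no example of its own in the list above. As a sketch, it flips every TRUE to FALSE and vice versa, so it keeps exactly the elements that a condition would have dropped. The object names are made up for this illustration.

```r
data <- c(5, 8, 2, 6, 1)

# Keep the elements that are NOT greater than 4
not_above_4 <- data[!(data > 4)]
not_above_4                          # 2 1

# Keep the elements that are NOT in the set {2, 5}
not_in_set <- data[!(data %in% c(2, 5))]
not_in_set                           # 8 6 1
```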

Let’s try with the mtcars dataset

Using the mtcars dataset in R:

  1. Filter the rows where the number of cylinders (cyl) is 6.

  2. From the filtered data, extract only the cars that have a miles-per-gallon (mpg) value greater than 20.

  3. Return the names of these cars.

# Load the mtcars dataset
data(mtcars)

# Filter rows based on the conditions
filtered_data <- mtcars[mtcars$cyl == 6 & mtcars$mpg > 20, ]

# Get the names of the cars
car_names <- rownames(filtered_data)
print(car_names)
[1] "Mazda RX4"      "Mazda RX4 Wag"  "Hornet 4 Drive"
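The same result can be reached in other ways. The sketch below shows two alternatives, offered as possible variations rather than as the recommended approach: the base R function subset(), which lets the column names be written without the mtcars$ prefix, and direct indexing of the row names with the logical condition.

```r
# Alternative 1: subset() evaluates the condition inside the data frame
six_cyl_efficient <- subset(mtcars, cyl == 6 & mpg > 20)
names_via_subset <- rownames(six_cyl_efficient)

# Alternative 2: index the row names directly with the logical vector
names_via_index <- rownames(mtcars)[mtcars$cyl == 6 & mtcars$mpg > 20]

names_via_subset   # "Mazda RX4" "Mazda RX4 Wag" "Hornet 4 Drive"
identical(names_via_subset, names_via_index)   # TRUE
```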

Exercise!

  1. Create a numeric variable “num_var” with the value “42.5”.

  2. Create a character variable “char_var” with the value “R is fun!”.

  3. Print the data type of “char_var”.

  4. Create a list “student_info” with the following elements:

    1. “name”: “Mohammad Nasir Abdullah”

    2. “age”: 18

    3. “grades”: a numeric vector with values “99, 100, 89”.

  5. Create a data frame “df_students” with the following columns:

    1. “Name”: “John”, “Pablo”

    2. “Age”: “22”, “30”

    3. “Grade”: “A”, “C”

  6. Create a numeric vector “vec_num” with the values “5,10,15,20”.

  7. Extract the second and third elements from “vec_num”.

  8. Create a vector “vec_seq” that contains a sequence of numbers from 1 to 10.

  9. Create a vector “vec_rand” with 5 random numbers between 1 and 100.

  10. Create a character vector “vec_char” with the values “apple”, “banana”, “cherry”.

  11. Use the substr() function to extract the first 3 characters from each element of “vec_char”.

  12. Create a character vector “vec_colors” with the values “red”, “blue”, “green”, “red”, “blue”.

  13. Convert “vec_colors” into a factor vector “factor_colors”.

  14. Print the levels of “factor_colors”.

  15. Given a numeric vector “vec_bonus = c(10,20,30,40,50)”, write a code snippet to:

    1. Extract all elements greater than 25.

    2. Calculate the mean of the extracted elements.

  16. Given the following vectors:

    weights <- c(55, 68, 72, 61, 58, 75, 64, 70)
    names <- c("Alice", "Bob", "Charlie", "Daisy", "Eve", "Frank", "Grace", "Henry")

    a. Identify the names of individuals whose weight is greater than 65kg.

    b. Find the average weight of individuals weighing less than or equal to 60kg.

    c. Extract the weights of individuals whose names start with the letter ‘A’ or ‘D’.

    d. Determine the number of individuals whose weight is between 60kg and 70kg (inclusive).

Hint:
You might find functions like mean(), substr(), and length() useful.

  17. Given the following vector of daily temperatures (in Celsius) for a week:

    temperatures <- c(22, 25, 19, 21, 18, 24, 23)
    days <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday")

    a. Identify the days when the temperature was below 20°C.

    b. Calculate the difference between the highest and lowest temperatures of the week.

    c. Determine the number of days when the temperature was between 20°C and 24°C (inclusive).

  18. Given the following vector of exam scores:

    scores <- c(85, 78, 92, 65, 88, 70, 95, 80, 60, 90)
    students <- c("Anna", "Ben", "Cara", "Dan", "Ella", "Finn", "Grace", "Hank", "Ivy", "Jack")

    a. Identify the students who scored above 90.

    b. Find the average score of students who scored below 80.

    c. Extract the scores of students whose names start with a vowel (A, E, I, O, U).

Take Home Exercise

  1. Basic Vector Creation: Create a numeric vector containing the numbers from 10 to 20 and a character vector containing the days of the week. Display both vectors.

  2. Vector Concatenation: Given two vectors: a <- c(1, 2, 3) and b <- c(4, 5, 6), concatenate them to form a single vector.

  3. Element Extraction: Given the vector temperatures <- c(15, 18, 20, 22, 19, 17, 16), extract the temperatures for the 2nd, 4th, and 6th days.

  4. Omitting Elements: From the same temperatures vector, omit the temperature of the 5th day.

  5. Vector Arithmetic: Given the vector prices <- c(10, 20, 30, 40, 50), calculate the new prices after a 10% discount.

  6. Patterned Vectors: Create a vector that contains the numbers from 5 to 50, but only includes every 5th number (i.e., 5, 10, 15, ...).

  7. Random Patterns: Simulate the results of rolling a 6-sided die 10 times.

  8. Character Vector Operations: Given the vector fruits <- c("apple", "banana", "cherry"), extract the first three letters of each fruit.

  9. Logical Vector Filtering: Given the vector ages <- c(25, 30, 35, 40, 45, 50), identify which ages are greater than 30 and less than 50.

  10. Relational Operators with Vectors: Given two vectors: vector1 <- c(5, 10, 15) and vector2 <- c(10, 10, 20), determine which elements of vector1 are less than or equal to the corresponding elements in vector2.