Before modifying the dataset, it is important to understand what you are working with. This step uses three standard R functions to profile the data frame's structure, statistical properties, and record uniqueness. Running these checks early surfaces problems (wrong column types, unexpected ranges, missing values, and duplicates) that would otherwise corrupt later analysis.
Inspection code
upc-grupo5-tb1.R
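As a minimal sketch of the three calls this step describes, the snippet below runs them on a small hypothetical data frame. The toy columns (`id`, `price`, `city`) and the helper variable `n_dup` are assumptions for illustration and stand in for the project's real dataset, which is expected to be loaded into `df` already.

```r
# Hypothetical toy data standing in for the real dataset (assumption).
df <- data.frame(
  id    = c(1L, 2L, 2L, 3L),
  price = c(10.5, 8.0, 8.0, NA),
  city  = c("Lima", "Cusco", "Cusco", "Lima"),
  stringsAsFactors = FALSE
)

str(df)                        # structure: dimensions, column names, types
summary(df)                    # per-column summary, including NA's for price
n_dup <- sum(duplicated(df))   # rows that exactly repeat an earlier row
n_dup
```

Here the third row repeats the second, so the duplicate count is 1, and `summary(df)` reports one missing value under `price`.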
What each function reveals
str(df): Prints a compact structural overview: the number of observations and variables, the name of each column, its storage type (chr, int, num, logi), and the first few values. This is the fastest way to confirm whether columns were imported with the correct types.
summary(df): For numeric columns, prints the minimum, first quartile, median, mean, third quartile, and maximum. For character columns it prints only the length, class, and mode. Missing values in numeric columns appear as NA's: n under the affected column, giving an immediate count without extra code. Character columns get no NA count here, so check them separately with is.na().
sum(duplicated(df)): duplicated() returns a logical vector that is TRUE for every row that is an exact copy of an earlier row. Wrapping it in sum() counts each TRUE as 1 and gives the total number of duplicate rows. A non-zero result means unique() must be applied before modelling.
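The duplicate check and the removal step it points to can be sketched together. This is an illustration under stated assumptions, not the project's script: the data frame `toy` and the names `dup_flags` and `toy_unique` are hypothetical.

```r
# Hypothetical toy data with two repeated rows (assumption).
toy <- data.frame(
  id   = c(1L, 2L, 2L, 3L, 1L),
  city = c("Lima", "Cusco", "Cusco", "Lima", "Lima"),
  stringsAsFactors = FALSE
)

dup_flags <- duplicated(toy)   # TRUE marks a row that repeats an earlier one
sum(dup_flags)                 # total number of duplicate rows

# unique() keeps the first occurrence of each distinct row;
# toy[!duplicated(toy), ] is an equivalent way to write it.
toy_unique <- unique(toy)
nrow(toy_unique)               # rows remaining after deduplication
```

Because `duplicated()` flags only the later copies, the first occurrence of each row is always the one that survives.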