Title: | Helper Tools for Managing Data, Dates, Missing Values, and Text |
---|---|
Description: | An assortment of helper functions for managing data (e.g., rotating values in matrices by a user-defined angle, switching from row- to column-indexing), dates (e.g., intuiting year from messy date strings), handling missing values (e.g., removing elements/rows across multiple vectors or matrices if any have an NA), text (e.g., flushing reports to the console in real-time); and combining data frames with different schema (copying, filling, or concatenating columns or applying functions before combining). |
Authors: | Adam B. Smith [cre, aut] |
Maintainer: | Adam B. Smith <[email protected]> |
License: | GPL (>=3) |
Version: | 1.2.14 |
Built: | 2025-01-26 03:09:34 UTC |
Source: | https://github.com/adamlilith/omnibus |
This function "adds" two or more lists together, whether or not they share names. For example, if one list is l1 <- list(a=1, b="XYZ") and the second is l2 <- list(a=3, c=FALSE), the output will be list(a = c(1, 3), b = "XYZ", c = FALSE). All elements in each list must have names.
appendLists(...)
... |
Two or more lists. All elements must have names. |
If two lists share the same name and these elements have the same class, then they will be merged as-is. If the classes are different, one of them will be coerced to the other (see *Examples*). The output will have elements with the names of all lists.
A list.
# same data types for same named element
l1 <- list(a=1, b="XYZ")
l2 <- list(a=3, c=FALSE)
appendLists(l1, l2)

# different data types for same named element
l1 <- list(a=3, b="XYZ")
l2 <- list(a="letters", c=FALSE)
appendLists(l1, l2)
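The appending behavior described above can be approximated in base R. This is a minimal sketch, not the package's own code: the helper name append_by_name is hypothetical, and it relies on c() to do the class coercion.

```r
# Hypothetical helper (not part of the package): append list elements by name.
append_by_name <- function(...) {
  out <- list()
  for (l in list(...)) {
    for (nm in names(l)) {
      # c(NULL, value) is just value; mixed classes are coerced by c()
      out[[nm]] <- c(out[[nm]], l[[nm]])
    }
  }
  out
}

res <- append_by_name(list(a = 1, b = "XYZ"), list(a = 3, c = FALSE))
str(res)

# a number and a string under the same name are coerced to character
mixed <- append_by_name(list(a = 3), list(a = "letters"))
str(mixed)
```

Elements that share a name are concatenated in the order the lists are supplied, and names seen in only one list are carried over unchanged.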
This function takes an ordered vector of numeric or character values and finds the pair that bracket a third value, x. If x is exactly equal to one of the values in the vector, then a single value equal to x is returned. If x falls outside of the range of the vector, then the least/most extreme value of the vector is returned (depending on which side of the distribution of the vector x resides). Optionally, users can have the function return the index of the values that bracket x.
bracket(x, by, index = FALSE, inner = TRUE, warn = FALSE)
x |
A numeric or character vector. |
by |
A numeric or character vector. These must be sorted (from high to low or low to high); if not, an error will result. |
index |
Logical. If |
inner |
Logical. If |
warn |
Logical. If |
If x is a single value, then the function will return a numeric vector of length 1 or 2, depending on how many values bracket x. If all values of by are the same, then the median index (or value) of by is returned. If x is a vector, then the result will be a list with one element per item in x with each element having the same format as the case when x is a single value.
by <- 2 * (1:5)
bracket(4.2, by)
bracket(6.8, by)
bracket(3.2, by, index=TRUE)
bracket(c(3.2, 9.8, 4), by)

bracket(2, c(0, 1, 1, 1, 3, 5), index=TRUE)
bracket(3, c(1, 2, 10))

bracket(2.5, c(1, 2, 2, 2, 3, 3), index=TRUE)
bracket(2.5, c(1, 2, 2, 2, 3, 3), index=TRUE, inner=FALSE)
bracket(2.9, c(1, 2, 2, 2, 3, 3), index=TRUE)
bracket(2.9, c(1, 2, 2, 2, 3, 3), index=TRUE, inner=FALSE)

by <- 1:10
bracket(-100, by)
bracket(100, by)
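The core idea of bracketing can be sketched in base R for the simplest case. The helper below is hypothetical and much narrower than bracket(): it assumes a single numeric x, a by vector sorted from low to high, and it returns values only (no index, inner, or warn handling).

```r
# Hypothetical, simplified sketch: values of 'by' that bracket a single x.
# Assumes 'by' is numeric and sorted in increasing order.
bracket_sketch <- function(x, by) {
  if (x <= min(by)) return(min(by))   # at or below the range: least extreme value
  if (x >= max(by)) return(max(by))   # at or above the range: most extreme value
  if (x %in% by) return(x)            # exact match: a single value
  c(max(by[by < x]), min(by[by > x])) # otherwise the nearest value on each side
}

by <- 2 * (1:5)
inside <- bracket_sketch(4.2, by)
exact <- bracket_sketch(6, by)
below <- bracket_sketch(-100, by)
above <- bracket_sketch(100, by)
```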
Capitalize the first letter of a string or the first letters of a list of strings.
capIt(x)
x |
Character or character vector. |
Character or character vector.
x <- c('shots', 'were', 'exchanged at the ', 'hospital.')
capIt(x)
This function combines multiple "source" data frames, possibly with different column names, into a single "destination" data frame. Usually merge will be faster and easier to implement if the columns to be merged on have the same names, and rbind will always be faster and much easier if the column names and data types match exactly.
The key tool in this function is a "crosswalk" table (a data.frame) that tells the function which fields in each source data frame match to the final fields in the destination data frame. Values in source data frame fields can be used as-is, combined across fields, or have functions applied to them before they are put into the destination data frame. If a source data frame does not have a field that matches the destination field, a default value (including NA) can be assigned to all cells for that source data frame.
The data frames to be combined can be provided in ... or as file names in the first column of the crosswalk table. These can be either CSV files (extension ".csv"), TAB files (extension ".tab"), "Rdata" files (read using load and with a ".rda" or ".rdata" extension), or "RDS" files (read using readRDS and with a ".rds" extension). The file type will be intuited from the extension, and its case does not matter. Note that if an object in an Rdata file has the same name as an object in this function (i.e., any of the arguments plus any objects internal to the function), this may cause a conflict. To help obviate this issue, all internal objects are named with a period at the end (e.g., "crossCell." and "countDf.").
All cells in each source data frame will have leading and trailing white spaces removed before combining.
combineDf(
  ...,
  crosswalk,
  collapse = "; ",
  useColumns = NULL,
  excludeColumns = NULL,
  useFrames = NULL,
  classes = NULL,
  verbose = FALSE
)
... |
Data frames to combine. These must be listed in the order that they appear in the |
crosswalk |
A Other than this column, the elements of each cell contain the name of the column in each source data frame that coincides with the column name in the More complex operations can be done using the following in cells of
|
collapse |
Character, specifies the string to put between fields combined with the |
useColumns, excludeColumns |
Logical, character vector, integer vector, or |
useFrames |
Logical, character, or |
classes |
Character or character list, specifies the classes (e.g., character, logical, numeric, integer) to be assigned to each column in the output table. If |
verbose |
Logical, if |
A data frame.
df1 <- data.frame(x1=1:5, x2=FALSE, x3=letters[1:5], x4=LETTERS[1:5],
  x5='stuff', x6=11:15)
df2 <- data.frame(y1=11:15, y2=rev(letters)[1:5], y3=runif(5))

crosswalk <- data.frame(
  a = c('x1', 'y1'),
  b = c('x2', '%fill% TRUE'),
  c = c('%cat% x3 x4', 'y2'),
  d = c('x5', '%fill% NA'),
  e = c('%fun% as.numeric(x6) > 12', '%fun% round(as.numeric(y3), 2)')
)

combined <- combineDf(df1, df2, crosswalk=crosswalk)
combined
These functions compare values while accounting for differences in floating point precision.
compareFloat(x, y, op, tol = .Machine$double.eps^0.5)

x %<% y
x %<=% y
x %==% y
x %>=% y
x %>% y
x %!=% y
x, y |
Numeric |
op |
Operator for comparison (must be in quotes): |
tol |
Tolerance value: The largest absolute difference between |
TRUE, FALSE, or NA.
x <- 0.9 - 0.8
y <- 0.8 - 0.7

x < y
x %<% y
compareFloat(x, y, "<")

x <= y
x %<=% y
compareFloat(x, y, "<=")

x == y
x %==% y
compareFloat(x, y, "==")

y > x
y %>% x
compareFloat(y, x, ">")

y >= x
y %>=% x
compareFloat(y, x, ">=")

x != y
x %!=% y
compareFloat(x, y, "!=")
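To see why a tolerance matters, the comparison can be sketched in base R. The helper names below are hypothetical, and treating two values as equal when their absolute difference is strictly below tol is an assumption, not necessarily the rule compareFloat() applies.

```r
# Hypothetical helpers: tolerance-based comparison (assumed rule: |x - y| < tol).
nearly_equal <- function(x, y, tol = .Machine$double.eps^0.5) {
  abs(x - y) < tol
}
clearly_less <- function(x, y, tol = .Machine$double.eps^0.5) {
  x < y & !nearly_equal(x, y, tol)
}

x <- 0.9 - 0.8
y <- 0.8 - 0.7

exact_eq <- x == y              # FALSE: the results differ in their last bits
tol_eq <- nearly_equal(x, y)    # TRUE: the difference is far below the tolerance
tol_lt <- clearly_less(x, y)    # FALSE: x < y holds only because of rounding
```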
Data frame of conversion factors for length or areal units.
conversionFactors
An object of class 'data.frame'.
data(conversionFactors)
conversionFactors
This function converts length and area values from one unit to another (e.g., meters to miles, or square yards to acres). Alternatively, it provides the conversion factor for changing one unit to another.
convertUnits(from = NULL, to = NULL, x = NULL)
from, to |
Character: Names of the units to convert from/to. Partial matching is used, and case is ignored. Valid values are listed below. The
|
x |
Numeric or |
expandUnits, conversionFactors
# conversion
convertUnits(from = 'm', to = 'km', 250)
convertUnits(from = 'm', to = 'mi', 250)
convertUnits(from = 'm2', to = 'km2', 250)

# conversion factors
convertUnits(from = 'm', to = 'km')
convertUnits(from = 'm')
convertUnits(to = 'm')
Return a corner of a matrix or data frame (i.e., upper left, upper right, lower left, lower right).
corner(x, corner = 1, size = 5)
x |
Data frame or matrix. |
corner |
Integer in the set |
size |
Positive integer, number of rows and columns to return. If there are fewer columns/rows than indicated then all columns/rows are returned. |
A matrix or data.frame.
x <- matrix(1:120, ncol=12, nrow=10)
x
corner(x, 1)
corner(x, 2)
corner(x, 3)
corner(x, 4)
Count the number of digits after a decimal place. Note that trailing zeros will likely be ignored.
countDecDigits(x)
x |
Numeric or numeric vector. |
Integer.
countDecDigits(c(1, 1.1, 1.12, 1.123, 1.1234, -1, 0, 10.0000, 10.0010))
This function takes a set of vectors, data frames, or matrices and removes the last values/rows so that they all have a length/number of rows equal to the shortest among them.
cull(...)
... |
Vectors, matrices, or data frames. |
A list with one element per object supplied as an argument to the function.
a <- 1:10
b <- 1:20
c <- letters
cull(a, b, c)

x <- data.frame(x=1:10, y=letters[1:10])
y <- data.frame(x=1:26, y=letters)
cull(x, y)
This function is a somewhat friendlier version of dir.create in that it automatically sets the recursive=TRUE and showWarnings=FALSE arguments.
dirCreate(...)
... |
Character string(s). The path and name of the directory to create. Multiple strings will be pasted together into one path, although slashes will not be pasted between them. |
Nothing (creates a directory on the storage system).
Data frame of day of month for each month in a leap year.
domLeap
An object of class 'data.frame'.
data(domLeap)
domLeap
Data frame of day of month for each month in a non-leap year.
domNonLeap
An object of class 'data.frame'.
data(domNonLeap)
domNonLeap
Data frame of day of year for each month in a leap year.
doyLeap
An object of class 'data.frame'.
data(doyLeap)
doyLeap
Data frame of days of year for each month in a non-leap year.
doyNonLeap
An object of class 'data.frame'.
data(doyNonLeap)
doyNonLeap
This function returns the machine epsilon (equal to .Machine$double.eps): the smallest positive floating-point number x such that 1 + x != 1.
eps()
Numeric value.
eps()
This function converts abbreviations of length and area units (e.g., "m", "km", and "ha") to their proper names (e.g., "meters", "kilometers", "hectares"). Square areal units are specified using an appended "2", where appropriate (e.g., "m2" means "meters-squared" and will be converted to "meters2").
expandUnits(x)
x |
Character: Abbreviations to convert. Case is ignored. |
convertUnits, conversionFactors
expandUnits(c('m', 'm2', 'ac', 'nm2'))
This function is helpful for Windows systems, where paths are usually expressed with backslashes, whereas R requires forward slashes.
forwardSlash(x)
x |
A string. |
Character.
forwardSlash("C:\\ecology\\main project")
This function inserts values into a vector, lengthening the overall vector. It is different from, say, x[1:3] <- c('a', 'b', 'c'), which simply replaces the values at indices 1 through 3.
insert(x, into, at, warn = TRUE)
x |
Vector of numeric, integer, character, or other values of the class of |
into |
Vector of values into which to insert |
at |
Vector of positions (indices) where |
warn |
If |
Vector.
x <- -1:-3
into <- 10:20
at <- c(1, 3, 14)
insert(x, into, at)
insert(-1, into, at)
This function inserts one or more columns or rows before or after another column or row in a data frame or matrix. It is similar to cbind except that the inserted column(s)/row(s) can be placed anywhere.
insertCol(x, into, at = NULL, before = TRUE)
insertRow(x, into, at = NULL, before = TRUE)
x |
Data frame, matrix, or vector with same number of columns or rows or elements as |
into |
Data frame or matrix into which |
at |
Character, integer, or |
before |
Logical, if |
A data frame.
insertRow(): Insert a column or row into a data frame or matrix
x <- data.frame(y1=11:15, y2=rev(letters)[1:5])
into <- data.frame(x1=1:5, x2='valid', x3=letters[1:5], x4=LETTERS[1:5], x5='stuff')

insertCol(x, into=into, at='x3')
insertCol(x, into=into, at='x3', before=FALSE)
insertCol(x, into)

x <- data.frame(x1=1:3, x2=LETTERS[1:3])
into <- data.frame(x1=11:15, x2='valid')
row.names(into) <- letters[1:5]

insertRow(x, into=into, at='b')
insertRow(x, into=into, at='b', before=FALSE)
insertRow(x, into)
Sometimes numeric values can appear to be whole numbers but are actually represented in the computer as floating-point values. In these cases, simple inspection of a value will not tell you if it is a whole number or not. This function tests if a number is "close enough" to an integer to be a whole number. Note that is.integer will indicate if a value is of class integer (which, if it is, will always be a whole number), but objects of class numeric will not evaluate to TRUE even if they are "supposed" to represent integers.
is.wholeNumber(x, tol = .Machine$double.eps^0.5)
x |
A numeric or integer vector. |
tol |
Largest absolute difference between a value and its integer representation for it to be considered a whole number. |
A logical vector.
x <- c(4, 12 / 3, 21, 21.1)
is.wholeNumber(x)
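The "close enough" test can be sketched in base R as a comparison between each value and its nearest integer. The helper name is_whole is hypothetical, and the strict "<" against tol is an assumption about how the tolerance is applied.

```r
# Hypothetical sketch: a value is "whole" if it is within tol of its rounded value.
is_whole <- function(x, tol = .Machine$double.eps^0.5) {
  abs(x - round(x)) < tol
}

x <- c(4, 12 / 3, 21, 21.1)
whole <- is_whole(x)        # TRUE TRUE TRUE FALSE
stored_int <- is.integer(x) # FALSE: x is stored as double, not integer

y <- sqrt(2)^2              # prints as 2 but is not exactly 2
y_exact <- y == 2           # FALSE
y_whole <- is_whole(y)      # TRUE
```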
Returns TRUE if the year is a leap year. You can use "negative" years for BCE.
isLeapYear(x)
x |
Integer or vector of integers representing years. |
Vector of logical values.
isLeapYear(1990:2004) # note 2000 *was* a leap year
isLeapYear(1896:1904) # 1900 was *not* a leap year
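For reference, the Gregorian leap-year rule behind the two comments above can be written in one line of base R. The helper name is_leap is hypothetical, and the sketch covers positive years only; how isLeapYear() treats "negative" (BCE) years is not assumed here.

```r
# Hypothetical sketch of the Gregorian rule: divisible by 4, except century
# years, which must also be divisible by 400.
is_leap <- function(year) {
  (year %% 4 == 0 & year %% 100 != 0) | year %% 400 == 0
}

leap_flags <- is_leap(c(1896, 1900, 1904, 1990, 2000, 2004))
```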
These functions work exactly the same as x == TRUE and x == FALSE but by default return FALSE for cases that are NA.
isTRUENA(x, ifNA = FALSE)
isFALSENA(x, ifNA = FALSE)
x |
Logical, or a condition that evaluates to logical, or a vector of logical values or conditions to evaluate. |
ifNA |
Logical, value to return if the result of evaluating |
Logical or value specified in ifNA.
isFALSENA(): Vectorized test for truth robust to NA
isTRUE, isFALSE, TRUE, logical
x <- c(TRUE, TRUE, FALSE, NA)
x == TRUE
isTRUENA(x)
x == FALSE
isFALSENA(x)
isTRUENA(x, ifNA = Inf)

# note that isTRUE and isFALSE are not vectorized
isTRUE(x)
isFALSE(x)
This function is a slightly friendlier version of list.files in that it automatically includes the full.names=TRUE argument.
listFiles(x, ...)
x |
Path name of folder containing files to list. |
... |
Arguments to pass to |
Character.
# list files in location where R is installed
listFiles(R.home())
listFiles(R.home(), pattern='README')
This function returns the length of the longest run of a particular numeric value in a numeric vector. A "run" is an uninterrupted sequence of the same number. Runs can be "wrapped" so that if the sequence starts and ends with the target value then it is considered as a consecutive run.
longRun(x, val, wrap = FALSE, na.rm = FALSE)
x |
Numeric vector. |
val |
Numeric. Value of the elements of |
wrap |
Logical. If |
na.rm |
Logical. If |
Integer.
base::rle()
x <- c(1, 1, 1, 2, 2, 3, 4, 5, 6, 1, 1, 1, 1, 1)
longRun(x, 2)
longRun(x, 1)
longRun(x, 1, wrap=TRUE)
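The run-length logic, including wrapping, can be sketched with base::rle(). The helper long_run is hypothetical and assumes x contains no NA values; it is an illustration of the idea rather than the package's implementation.

```r
# Hypothetical sketch: longest run of 'val' in 'x', optionally wrapping around.
# Assumes 'x' has no NA values.
long_run <- function(x, val, wrap = FALSE) {
  r <- rle(x == val)
  runs <- r$lengths[r$values]          # lengths of runs that equal 'val'
  if (length(runs) == 0) return(0L)
  best <- max(runs)
  n <- length(r$values)
  # with wrapping, a leading and a trailing run are joined into one
  if (wrap && n > 1 && r$values[1] && r$values[n]) {
    best <- max(best, r$lengths[1] + r$lengths[n])
  }
  best
}

x <- c(1, 1, 1, 2, 2, 3, 4, 5, 6, 1, 1, 1, 1, 1)
run_2 <- long_run(x, 2)
run_1 <- long_run(x, 1)
run_1_wrapped <- long_run(x, 1, wrap = TRUE)
```

Here the three leading 1s and the five trailing 1s are separate runs unless wrap = TRUE, in which case they count as one run of eight.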
Consider an ordered set of values, say 0, 4, 0, 0, 0, 2, 0, 10. We can ask, "What is the number of times in which zeros appear successively?" This function can answer this question and similar ones. What is considered a "run" is defined by a user-supplied function that must have a TRUE/FALSE output. For example, a "run" could be any succession of values less than two, in which case the criterion function would be function(x) x < 2, or any succession of values not equal to 0, in which case the function would be function(x) x != 0.
maxRuns(x, fx, args = NULL, failIfAllNA = FALSE)
x |
Vector of numeric, character, or other values. |
fx |
A function that returns |
args |
A list object with additional arguments to supply to the function |
failIfAllNA |
If |
Lengths of successive runs of elements that meet the criterion. A single value of 0 indicates no conditions meet the criterion.
x <- c(1, 4, 0, 0, 0, 2, 0, 10)
fx <- function(x) x == 0
maxRuns(x, fx)

fx <- function(x) x > 0
maxRuns(x, fx)

fx <- function(x) x > 0 & x < 5
maxRuns(x, fx)

x <- c(1, 4, 0, 0, 0, 2, 0, 10)
fx <- function(x, th) x == th
maxRuns(x, fx, args=list(th=0))

# "count" NA as an observation
x <- c(1, 4, 0, 0, 0, NA, 0, 10)
fx <- function(x, th) ifelse(is.na(x), FALSE, x == th)
maxRuns(x, fx, args=list(th=0))

# include NAs as part of a run
x <- c(1, 4, 0, 0, 0, NA, 0, 10)
fx <- function(x, th) ifelse(is.na(x), TRUE, x == th)
maxRuns(x, fx, args=list(th=0))
Displays the largest objects in memory.
memUse(n = 10, orderBy = "size", decreasing = TRUE, pos = 1, ...)
n |
Positive integer: Maximum number of objects to display. |
orderBy |
Either |
decreasing |
Logical, if |
pos |
Environment from which to obtain size of objects. Default is 1. See |
... |
Other arguments to pass to |
Data frame.
memUse()
memUse(3)
This function merges two or more lists to create a single, combined list. If two elements in different lists have the same name, items in the later list take precedence (e.g., if there are three lists, then values in the third list take precedence over items with the same name in the second, and the second has precedence over items in the first).
mergeLists(...)
... |
Two or more lists. |
A list.
list1 <- list(a=1:3, b='Hello world!', c=LETTERS[1:3])
list2 <- list(x=4, b='Goodbye world!', z=letters[1:2])
list3 <- list(x=44, b='What up, world?', z=c('_A_', '_Z_'), w = TRUE)

mergeLists(list1, list2)
mergeLists(list2, list1)
mergeLists(list1, list2, list3)
mergeLists(list3, list2, list1)
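The "later list wins" rule can be sketched in base R by assigning each list into an accumulator by name. The helper merge_lists is hypothetical and assumes every element is named; it is not the package's own code.

```r
# Hypothetical sketch: merge named lists, later lists overriding earlier ones.
merge_lists <- function(...) {
  out <- list()
  for (l in list(...)) {
    out[names(l)] <- l   # existing names are overwritten, new names appended
  }
  out
}

list1 <- list(a = 1:3, b = 'Hello world!', c = LETTERS[1:3])
list2 <- list(x = 4, b = 'Goodbye world!', z = letters[1:2])

forward <- merge_lists(list1, list2)   # b comes from list2
backward <- merge_lists(list2, list1)  # b comes from list1
```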
This function creates a "mirror" image of a character string, a number, a matrix, or a data frame. For example, "Shots were exchanged at the hospital" becomes "latipsoh eht ta degnahcxe erew stohS" and 3.14159 becomes 95141.3. Data frames and matrices will be returned with the order of columns or order of rows reversed.
mirror(x, direction = "lr")
x |
A vector of numeric or character values, or a matrix or data frame. |
direction |
Only used if |
Object with same class as x.
x <- 'Shots were exchanged at the hospital'
mirror(x)

x <- c('Water', 'water', 'everywhere')
mirror(x)

# last value will return NA because the exponentiation does not
# make sense when written backwards
x <- c(3.14159, 2.71828, 6.02214076e+23)
mirror(x)

x <- data.frame(x=1:5, y=6:10)
mirror(x)

x <- matrix(1:10, nrow=2)
mirror(x)
Modal value. If there is more than one unique mode, all modal values are returned.
mmode(x)
x |
Numeric or character vector. |
Numeric or character vector.
x <- c(1, 2, 3, 3, 4, 5, 3, 1, 2)
mmode(x)

x <- c(1, 2, 3)
mmode(x)
This function and set of operators perform simple (vectorized) comparisons using <, <=, >, >=, !=, or == between values and always return TRUE or FALSE. TRUE only occurs if the condition can be evaluated and it is TRUE. FALSE is returned if the condition is FALSE or it cannot be evaluated.
naCompare(op, x, y)

x %<na% y
x %<=na% y
x %==na% y
x %!=na% y
x %>na% y
x %>=na% y
op |
Character, the operation to perform: |
x, y |
Vectors of numeric, character, |
Vector of logical values.
naCompare('<', c(1, 2, NA), c(10, 1, 0))
naCompare('<', c(1, 2, NA), 10)
naCompare('<', c(1, 2, NA), NA)

# compare to:
NA < 5
NA < NA

# same operations with operators:
1 %<na% 2
1 %<na% NA
3 %==na% 3
NA %==na% 3
4 %!=na% 4
4 %!=na% NA
5 %>=na% 3
5 %>=na% NA

3 %==na% c(NA, 1, 2, 3, 4)

# compare to:
1 < 2
1 < NA
3 == 3
NA == 3
4 != 4
4 != NA
5 >= 3
5 >= NA

3 == c(NA, 1, 2, 3, 4)
This function removes elements from one or more equal-length vectors at every position where at least one of the vectors has an NA. For example, if there are three vectors A, B, and C, and A has an NA in the first position and C has an NA in the third position, then A, B, and C will each have the elements at positions 1 and 3 removed.
naOmitMulti(...)
... |
Numeric or character vectors. |
List of objects with the same classes as those supplied in the ... argument.
a <- c(NA, 'b', 'c', 'd', 'e', NA)
b <- c(1, 2, 3, NA, 5, NA)
c <- c(6, 7, 8, 9, 10, NA)
naOmitMulti(a, b, c)
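The position-wise removal can be sketched in base R with complete.cases(), which accepts several vectors at once. The variable names below are illustrative, and this covers vectors only, not the matrix or data frame case.

```r
# Sketch: drop every position at which any of the vectors is NA.
v1 <- c(NA, 'b', 'c', 'd', 'e', NA)
v2 <- c(1, 2, 3, NA, 5, NA)
v3 <- c(6, 7, 8, 9, 10, NA)

keep <- complete.cases(v1, v2, v3)   # TRUE where no vector has an NA
cleaned <- lapply(list(v1 = v1, v2 = v2, v3 = v3), function(v) v[keep])
cleaned
```

Positions 1, 4, and 6 are dropped from all three vectors because at least one vector has an NA there.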
This function returns the row number of any row in a data frame or matrix that has at least one NA. This is the same as which(!complete.cases(x)).
naRows(x, inf = FALSE, inverse = FALSE)
x |
Data frame or matrix. |
inf |
Logical, if |
inverse |
Logical, if |
Integer vector.
x <- data.frame(a=1:5, b=c(1, 2, NA, 4, 5), c=c('a', 'b', 'c', 'd', NA))
naRows(x)
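The base R equivalent mentioned above, which(!complete.cases(x)), can be run directly on the example data. The complement shown second is only an illustration of selecting the fully observed rows; it makes no claim about what the inverse argument does.

```r
x <- data.frame(a = 1:5, b = c(1, 2, NA, 4, 5), c = c('a', 'b', 'c', 'd', NA))

na_rows <- which(!complete.cases(x))  # rows with at least one NA
ok_rows <- which(complete.cases(x))   # rows with no NA at all

x[ok_rows, ]
```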
Indicate if elements of a vector are not in another vector.
notIn(x, table)

x %notin% table
x, table |
Vectors. |
A logical vector.
x <- c('a', 'v', 'o', 'C', 'a', 'd', 'O')
y <- letters

y %notin% x
x %notin% y
This function takes two data frames or matrices and returns a matrix of pairwise Euclidean distances between the two.
pairDist(x1, x2 = NULL, na.rm = FALSE)
x1 |
Data frame or matrix one or more columns wide. |
x2 |
Data frame or matrix one or more columns wide. If |
na.rm |
Logical, if |
Matrix with nrow(x1) rows and nrow(x2) columns. Values are the distance between each row of x1 and row of x2.
x1 <- data.frame(x=sample(1:30, 30), y=sort(round(100 * rnorm(30))), z=sample(1:30, 30))
x2 <- data.frame(x=1:20, y=round(100 * rnorm(20)), z=sample(1:20, 20))

pairDist(x1, x2)
pairDist(x1)
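The shape and meaning of the returned matrix can be sketched in base R. The helper pair_dist is hypothetical; it assumes that omitting x2 means "compare x1 with itself", and it does no NA handling.

```r
# Hypothetical sketch: Euclidean distance between every row of x1 and every
# row of x2. Assumption: a missing x2 defaults to x1.
pair_dist <- function(x1, x2 = x1) {
  x1 <- as.matrix(x1)
  x2 <- as.matrix(x2)
  out <- matrix(NA_real_, nrow = nrow(x1), ncol = nrow(x2))
  for (i in seq_len(nrow(x1))) {
    diffs <- sweep(x2, 2, x1[i, ])       # subtract row i of x1 from each row of x2
    out[i, ] <- sqrt(rowSums(diffs^2))
  }
  out
}

p1 <- matrix(c(0, 0,
               3, 4), ncol = 2, byrow = TRUE)
p2 <- matrix(c(0, 0), ncol = 2)

d12 <- pair_dist(p1, p2)   # 2 x 1 matrix: distances 0 and 5
d11 <- pair_dist(p1)       # 2 x 2 matrix with zeros on the diagonal
```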
This function is the same as pmatch, but it can throw an error instead of NA if no match is found, and can be forced to throw the error if more than the desired number of matches is found.
pmatchSafe(
  x,
  table,
  useFirst = FALSE,
  error = TRUE,
  ignoreCase = TRUE,
  nmax = length(x),
  ...
)
x |
Character: String to match. |
table |
Character vector: Values to which to match. |
useFirst |
Logical: If |
error |
Logical: If no match is found, return an error? |
ignoreCase |
Logical: If |
nmax |
Positive numeric integer: Maximum allowable number of matches. If more than this number of matches is found, an error will be thrown (regardless of the value of |
... |
Arguments to pass to |
One or more of the values in table.
pmatchSafe('ap', c('apples', 'oranges', 'bananas'))
pmatchSafe('AP', c('apples', 'oranges', 'bananas'))
pmatchSafe('AP', c('apples', 'oranges', 'bananas'),
  ignoreCase = FALSE, error = FALSE)
pmatchSafe(c('ba', 'ap'), c('apples', 'oranges', 'bananas'))

# No match:
tryCatch(
  pmatchSafe('kumquats', c('apples', 'oranges', 'bananas')),
  error = function(cond) FALSE
)

pmatchSafe('kumquats', c('apples', 'oranges', 'bananas'), error = FALSE)

pmatchSafe(c('ap', 'corn'), c('apples', 'oranges', 'bananas'), error = FALSE)

# Too many matches:
tryCatch(
  pmatchSafe(c('ap', 'ba'), c('apples', 'oranges', 'bananas'), nmax = 1),
  error=function(cond) FALSE
)
Add leading characters to a string. This function is useful for ensuring, say, files get sorted in a particular order. For example, on some operating systems a file name "file 1" would come first, then "file 10", then "file 11", "file 12", etc., then "file 2", "file 21", and so on. Using prefix, you can add one or more leading zeros so that file names are as "file 01", "file 02", "file 03", and so on, and they will sort that way.
prefix(x, len, pad = "0")
x |
Character or character vector to which to add a prefix. |
len |
The total number of characters desired for each string. If a string is already this length or longer then nothing will be prefixed to that string. |
pad |
Character. Symbol to prefix to each string. |
Character or character vector.
prefix(1:5, len=2)
prefix(1:5, len=5)
prefix(1:5, len=3, pad='!')
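The padding rule (pad up to len characters, leave longer strings alone) can be sketched in base R with strrep(). The helper pad_left is hypothetical and assumes pad is a single character.

```r
# Hypothetical sketch: left-pad strings to a total width of 'len'.
pad_left <- function(x, len, pad = "0") {
  x <- as.character(x)
  n_missing <- pmax(len - nchar(x), 0)   # never negative: long strings are untouched
  paste0(strrep(pad, n_missing), x)
}

padded <- pad_left(1:5, len = 2)
untouched <- pad_left("123", len = 2)
banged <- pad_left(7, len = 3, pad = "!")

# padded names sort in numeric order, unpadded ones do not
sorted_names <- sort(paste("file", pad_left(c(10, 2, 1), len = 2)))
```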
Calculates the area of a quadrilateral by dividing it into two triangles and applying Heron's formula.
quadArea(x, y)
x |
Numeric vector. |
y |
Numeric vector. |
Numeric (area of a quadrilateral in the same units as x and y).
x <- c(0, 6, 4, 1)
y <- c(0, 1, 7, 4)
quadArea(x, y)

plot(1, type='n', xlim=c(0, 7), ylim=c(0, 7), xlab='x', ylab='y')
polygon(x, y)
text(x, y, LETTERS[1:4], pos=4)
lines(x[c(1, 3)], y[c(1, 3)], lty='dashed', col='red')
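The two-triangle approach can be sketched in base R. The helper names are hypothetical, and the sketch assumes the four vertices are given in order around a convex quadrilateral, so that the diagonal from vertex 1 to vertex 3 (the dashed red line in the plot) lies inside it.

```r
# Heron's formula: area of a triangle from its three side lengths.
heron <- function(a, b, c) {
  s <- (a + b + c) / 2
  sqrt(s * (s - a) * (s - b) * (s - c))
}

# Hypothetical sketch: split the quadrilateral along the diagonal 1-3 and sum
# the two triangle areas. Assumes ordered vertices of a convex quadrilateral.
quad_area <- function(x, y) {
  d <- function(i, j) sqrt((x[i] - x[j])^2 + (y[i] - y[j])^2)
  heron(d(1, 2), d(2, 3), d(1, 3)) + heron(d(1, 3), d(3, 4), d(4, 1))
}

unit_square <- quad_area(c(0, 1, 1, 0), c(0, 0, 1, 1))
example_quad <- quad_area(c(0, 6, 4, 1), c(0, 1, 7, 4))
```

For the example coordinates, the two triangles have areas 19 and 4.5, which agrees with the shoelace formula's total of 23.5.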
Rename columns of a data.frame or matrix.
renameCol(x, old, new)
x |
A |
old |
Character vector with name(s), or numeric vector of the indices of the column(s) you want to rename. |
new |
Character vector of new names. |
A data.frame or matrix.
x <- data.frame(old_x = 1:5, old_y = letters[1:5], old_z = LETTERS[1:5])
x
renameCol(x, c('old_y', 'old_z'), c('new_Y', 'new_Z'))
renameCol(x, c(2, 3), c('new_Y', 'new_Z')) # same as above

# Long way:
new <- c('new_Y', 'new_Z')
colnames(x)[match(c('old_y', 'old_z'), colnames(x))] <- new
This function renumbers a sequence, which is helpful if "gaps" appear in the sequence. For example, consider the sequence {1, 1, 3, 1, 8, 8, 8}. This function will renumber the sequence as {1, 1, 2, 1, 3, 3, 3}. NAs are ignored.
renumSeq(x)
x |
Numerical or character vector. |
A vector.
x <- c(1, 1, 3, 1, 8, 8, 8)
renumSeq(x)

y <- c(1, 1, 3, 1, 8, NA, 8, 8)
renumSeq(y)

z <- c('c', 'c', 'b', 'a', 'w', 'a')
renumSeq(z)
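The renumbering described above amounts to replacing each value with its rank among the sorted unique values. A minimal base-R sketch follows; `renum` is a hypothetical name, and the treatment of character input (alphabetical order) is an assumption that may differ from what `renumSeq()` does.

```r
# Hypothetical helper: map each value to its position among the
# sorted unique values, so gaps in the sequence disappear.
# sort() drops NA, so NA inputs stay NA in the output.
renum <- function(x) match(x, sort(unique(x)))

renum(c(1, 1, 3, 1, 8, 8, 8))        # 1 1 2 1 3 3 3
renum(c(1, 1, 3, 1, 8, NA, 8, 8))    # 1 1 2 1 3 NA 3 3
```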
This function rotates the values in a matrix by a user-specified number of degrees. In almost all cases some values will fall outside the matrix and will be discarded. Cells that have no rotated values will become NA. Only square matrices can be accommodated. In some cases a rotation will cause cells to have no assigned value because no original values fall within them. In these instances the mean value of surrounding cells is assigned to the cells with missing values. If the angle of rotation is too small then no rotation will occur.
rotateMatrix(x, rot)
x |
A matrix. |
rot |
Numeric. Number of degrees to rotate matrix. Values represent difference in degrees between "north" (up) and the clockwise direction. |
A matrix.
base::t()
x <- matrix(1:100, nrow=10)
x
rotateMatrix(x, 90) # 90 degrees to the right
rotateMatrix(x, 180) # 180 degrees to the right
rotateMatrix(x, 45) # 45 degrees to the right
rotateMatrix(x, 7) # slight rotation
rotateMatrix(x, 5) # no rotation because angle is too small
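For the special case of right-angle rotations, no values are lost and the result can be obtained with `base::t()` alone. The sketch below covers only that special case; `rotate90` is a hypothetical helper, and arbitrary angles (as handled by `rotateMatrix()`) need a different approach that is not shown here.

```r
# Hypothetical helper: rotate a matrix 90 degrees clockwise by
# reversing the row order and then transposing.
rotate90 <- function(m) t(m[nrow(m):1, , drop = FALSE])

m <- matrix(1:4, nrow = 2)
m
rotate90(m)             # 90 degrees to the right
rotate90(rotate90(m))   # 180 degrees: same as m[nrow(m):1, ncol(m):1]
```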
This function examines a numeric value (typically with numbers after the decimal place) and estimates either:
The number of significant digits of the numerator and denominator of a fraction that would (approximately) result in the given value.
The number of digits to which an integer may have been rounded.
Which of these is estimated depends on whether the input has digits after the decimal place or is an integer. Negative values are treated as positive, so the negative of a number returns the same value as its positive version. Details are given below. Note that values can appear to be rounded or repeating even when they are not.
roundedSigDigits(x, minReps = 3)
x |
Numeric or numeric vector. |
minReps |
Integer. Number of times a digit or sequence of digits that occur after a decimal place needs to be repeated to assume it represents a repeating series and thus is assumed to arise from using decimal places to represent a fraction. Default is 3. For example, if |
For values with at least one non-zero digit after a decimal place with no repeated series of digits detected, the function simply returns the total number of digits (ignoring trailing zeros) times -1. For example:
0.3 returns -1 because there is just one value after the decimal.
0.34567 returns -5 because there are no repeats up to the 5th decimal place.
0.1212125 returns -7 because there are no repeats (starting from the right) up to the 7th decimal place.
0.111117 returns -6 because there are no repeats (starting from the right) up to the 6th decimal place.
The function takes account of rounding up:
0.666 might be a truncated version of 2/3. Two and three each have 1 significant digit, so the function returns -1 (1 value after the decimal place).
0.667 also returns -1 because this might represent a rounding of 2/3 and it is customary to round digits up if the next digit would have been >5.
0.3334 returns -4 because it is inappropriate to round 3 up to 4 if the next digit would have been 5 or less.
Repeating series are accounted for. For example:
0.121212 returns -2 because "12" starts repeating after the second decimal place.
0.000678678678 returns -6 because "678" starts repeating after the 6th place.
0.678678678 returns -3.
0.678678679 also returns -3 because 678 could be rounded to 679 if the next digit were 6.
Note that you can set the minimum number of times a digit or series needs to be repeated to count as being repeated using the argument minReps. The default is 3, so digits or series of digits need to be repeated at least 3 times to count as a repetition, but this can be changed:
0.1111 returns -1 using the default requirement for 3 repetitions but -4 if the number of minimum repetitions is 5 or more.
0.121212 returns -2 using the default requirement for 3 repetitions but -6 if the number of minimum repetitions is 4 or more.
Trailing zeros are ignored, so 0.12300 returns -3. When values do not have digits after a decimal place the location of the first non-zero digit from the right is returned as a positive integer. For example:
234 returns 1 because the first non-zero digit from the right is in the 1s place.
100 returns 3 because the first non-zero digit from the right is in the 100s place.
70001 returns 1 because the first non-zero digit from the right is in the 1s place.
However, note a few oddities:
4E5 returns 6 but 4E50 probably will not return 51 because many computers have a hard time internally representing numbers that large.
4E-5 returns -5 but 4E-50 probably will not return -50 because many computers have a hard time internally representing numbers that small.
-100 and 100 return 3 and -0.12 and 0.12 return -2 because the negative sign is ignored.
0 returns 0.
NA and NaN return NA.
Integer (number of digits) or NA (does not appear to be rounded).
roundedSigDigits(0.3)
roundedSigDigits(0.34567)
roundedSigDigits(0.1212125)
roundedSigDigits(0.111117)
roundedSigDigits(0.666)
roundedSigDigits(0.667)
roundedSigDigits(0.3334)
roundedSigDigits(0.121212)
roundedSigDigits(0.000678678678)
roundedSigDigits(0.678678678)
roundedSigDigits(0.678678679)
roundedSigDigits(0.1111)
roundedSigDigits(0.1111, minReps=5)
roundedSigDigits(0.121212)
roundedSigDigits(0.121212, minReps=4)
roundedSigDigits(234)
roundedSigDigits(100)
roundedSigDigits(70001)
roundedSigDigits(4E5)
roundedSigDigits(4E50)
roundedSigDigits(4E-5)
roundedSigDigits(4E-50)
roundedSigDigits(0)
roundedSigDigits(NA)

x <- c(0.0667, 0.0667, 0.067)
roundedSigDigits(x)
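The whole-number case described above (the place of the first non-zero digit from the right) can be sketched in base R by counting trailing zeros. This covers only inputs with no digits after the decimal place, and `firstNonZeroPlace` is a hypothetical helper written for illustration; the repeating-decimal logic of `roundedSigDigits()` is not reproduced.

```r
# Hypothetical helper for whole numbers only: returns 1 for the 1s place,
# 2 for the 10s place, 3 for the 100s place, and so on.
firstNonZeroPlace <- function(x) {
  vapply(abs(x), function(v) {
    if (is.na(v)) return(NA_integer_)        # NA and NaN give NA
    if (v == 0) return(0L)                   # 0 is treated as a special case
    digits <- format(v, scientific = FALSE, trim = TRUE)
    trailingZeros <- nchar(digits) - nchar(sub("0+$", "", digits))
    as.integer(trailingZeros + 1L)
  }, integer(1))
}

firstNonZeroPlace(c(234, 100, 70001, 4E5, -100))   # 1 3 1 6 3
```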
This function rounds a value to the nearest multiple of a "target" value (e.g., rounding 0.72 to the nearest 0.25 gives 0.75).
roundTo(x, target, roundFx = round)
x |
Numeric. |
target |
Numeric. |
roundFx |
Numeric.
roundTo(0.73, 0.05)
roundTo(0.73, 0.1)
roundTo(0.73, 0.25)
roundTo(0.73, 0.25, floor)
roundTo(0.73, 1)
roundTo(0.73, 10)
roundTo(0.73, 10, ceiling)
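Rounding to a target is commonly done by dividing by the target, rounding to a whole number, and multiplying back. The base-R sketch below assumes that is the idea here; `roundToSketch` is a hypothetical name and the package's own code may differ.

```r
# Hypothetical helper: round x to a multiple of `target`, using any
# rounding function (round, floor, ceiling) to pick the multiple.
roundToSketch <- function(x, target, roundFx = round) {
  roundFx(x / target) * target
}

roundToSketch(0.73, 0.25)          # 0.75
roundToSketch(0.73, 0.25, floor)   # 0.5
roundToSketch(0.73, 10, ceiling)   # 10
```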
This function converts cell indices between row- and column-style indexing of matrices. Column indexing (the default for matrices) has cell "1" in the upper left corner of the matrix. Cell "2" is below it, and so on. The numbering then wraps around to the top of the next column. Row indexing (the default for rasters, for example) also has cell "1" in the upper left, but cell "2" is to its right, and so on. Numbering then wraps around to the next row.
rowColIndexing(x, cell, dir)
x |
Either a matrix, or a vector with two values, one for the number of rows and one for the number of columns in a matrix. |
cell |
One or more cell indices (positive integers). |
dir |
The "direction" in which to convert. If |
One or more positive integers.
# column versus row indexing
colIndex <- matrix(1:40, nrow=5, ncol=8)
rowIndex <- matrix(1:40, nrow=5, ncol=8, byrow=TRUE)
colIndex
rowIndex

# examples
x <- matrix('a', nrow=5, ncol=8, byrow=TRUE)
rowColIndexing(x, cell=c(1, 6, 20), 'row')
rowColIndexing(x, cell=c(1, 6, 20), 'col')
rowColIndexing(c(5, 8), cell=c(1, 6, 20), 'row')
rowColIndexing(c(5, 8), cell=c(1, 6, 20), 'col')
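The arithmetic behind the two indexing schemes can be sketched with integer division and remainders. The helpers `colToRow` and `rowToCol` are hypothetical names introduced for illustration; how they map onto the `dir` argument of `rowColIndexing()` is not assumed here.

```r
# Hypothetical helpers for an nr x nc matrix.
# colToRow: column-style index -> row-style index of the same cell
colToRow <- function(cell, nr, nc) {
  r <- (cell - 1) %% nr + 1     # row of the cell
  k <- (cell - 1) %/% nr + 1    # column of the cell
  (r - 1) * nc + k
}

# rowToCol: row-style index -> column-style index of the same cell
rowToCol <- function(cell, nr, nc) {
  r <- (cell - 1) %/% nc + 1
  k <- (cell - 1) %% nc + 1
  (k - 1) * nr + r
}

colToRow(c(1, 6, 20), nr = 5, nc = 8)   # 1 2 36
rowToCol(c(1, 6, 20), nr = 5, nc = 8)   # 1 26 18
```

These results can be checked against the `rowIndex` matrix from the example: `rowIndex[20]` is 36, and the value 20 sits at column-style position 18.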
rstring() makes a string that is statistically extremely likely to be unique (using default options).
rstring(n, x = 12, filesafe = TRUE)
n |
Numeric integer: How many strings to make (default is 1). |
x |
Numeric integer: Number of letters and digits to use to make the string. Default is 12, leading to a probability of two matching random strings of <3.7E-18 if |
filesafe |
Logical: If |
Character.
rstring(1)
rstring(5)
rstring(5, 3)
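A random string of letters and digits can be drawn in base R with `sample()`. The sketch below is only an illustration of the idea; `randomString` and its character pool are assumptions, and the pool and file-safety rules used by `rstring()` may differ.

```r
# Hypothetical helper: draw `len` characters at random, with replacement,
# from upper- and lower-case letters and the digits 0-9.
randomString <- function(len = 12) {
  pool <- c(letters, LETTERS, 0:9)
  paste(sample(pool, len, replace = TRUE), collapse = "")
}

randomString()
randomString(3)
```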
This function is a nicer version of print() or cat(), especially when used inline in functions, because it displays immediately and pastes all strings together. It also does some rudimentary but optional word wrapping.
say(..., pre = 0, post = 1, breaks = NULL, wiggle = 10, preBreak = 1, level = NULL, deco = "#")
... |
Character strings to print |
pre |
Integer >= 0. Number of blank lines to print before strings |
post |
Integer >= 0. Number of blank lines to print after strings |
breaks |
Either |
wiggle |
Integer >= 0. Allows line to overrun |
preBreak |
If wrapping long lines indicates how subsequent lines are indented. NULL causes lines to be printed starting at column 1 on the display device. A positive integer inserts |
level |
Integer or |
deco |
Character. Character to decorate text with if |
Nothing (side effect is output on the display device).
say('The quick brown fox ', 'jumps over the lazy ', 'Susan.')
say('The quick brown fox ', 'jumps over the lazy ', 'Susan.', breaks=10)
say('The quick brown fox ', 'jumps over the lazy ', 'Susan.', level=1)
say('The quick brown fox ', 'jumps over the lazy ', 'Susan.', level=2)
say('The quick brown fox ', 'jumps over the lazy ', 'Susan.', level=3)
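The two core behaviors described above (pasting strings together and displaying them immediately) can be sketched with `cat()` and `utils::flush.console()`. `sayNow` is a hypothetical stand-in that omits wrapping, blank lines, and decoration; it is not the package's implementation.

```r
# Hypothetical helper: paste all pieces together, print them on one line,
# and flush the console so the text appears without buffering delays.
sayNow <- function(...) {
  cat(paste0(...), "\n", sep = "")
  utils::flush.console()
  invisible(NULL)
}

sayNow('The quick brown fox ', 'jumps over the lazy ', 'Susan.')
```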
This function extracts the leftmost or rightmost set of columns of a data frame or matrix.
side(x, side = 1, n = 3)
x |
A data.frame or matrix. |
side |
Either 1 (left side) or 2 (right side), or |
n |
Number of columns. The default is 3. |
A data.frame or matrix.
side(iris)
side(iris, 2)
side(iris, 'l')
side(iris, 'r')
side(iris, 1, 2)
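In base R, the leftmost or rightmost columns can be selected by index. The helpers below (`leftCols` and `rightCols`, both hypothetical names) sketch that idea and assume that asking for more columns than exist should simply return all of them.

```r
# Hypothetical helpers: first or last n columns of a data.frame or matrix
leftCols <- function(x, n = 3) {
  k <- min(n, ncol(x))
  x[ , seq_len(k), drop = FALSE]
}

rightCols <- function(x, n = 3) {
  k <- min(n, ncol(x))
  x[ , (ncol(x) - k + 1):ncol(x), drop = FALSE]
}

head(leftCols(iris))
head(rightCols(iris, 2))
```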
This function rescales a vector of numeric values to an arbitrary range. Optionally, after the stretch values equal to the lowest value can be "nudged" slightly higher to half the minimum value across the rescaled vector of values > 0.
stretchMinMax(x, lower = 0, upper = 1, nudgeUp = FALSE, nudgeDown = FALSE, na.rm = FALSE)
x |
Numeric list. |
lower |
Numeric, low end of range to which to stretch. |
upper |
Numeric, high end of range to which to stretch. |
nudgeUp, nudgeDown |
Logical, if |
na.rm |
Logical, if |
Numeric value.
x <- 1:10
stretchMinMax(x)
stretchMinMax(x, lower=2, upper=5)
stretchMinMax(x, nudgeUp=TRUE)
stretchMinMax(x, lower=2, upper=5, nudgeUp=TRUE)
stretchMinMax(x, nudgeDown=TRUE)
stretchMinMax(x, lower=2, upper=5, nudgeUp=TRUE, nudgeDown=TRUE)

x <- c(1:5, NA)
stretchMinMax(x)
stretchMinMax(x, na.rm=TRUE)
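The basic rescaling step is the usual min-max transform, lower + (x - min) / (max - min) * (upper - lower). A base-R sketch follows; `stretch` is a hypothetical name, and the nudging options of `stretchMinMax()` are deliberately left out.

```r
# Hypothetical helper: linear min-max rescaling to [lower, upper].
# With na.rm = FALSE, any NA in x makes min() and max() NA, so every
# output value is NA.
stretch <- function(x, lower = 0, upper = 1, na.rm = FALSE) {
  lo <- min(x, na.rm = na.rm)
  hi <- max(x, na.rm = na.rm)
  lower + (x - lo) / (hi - lo) * (upper - lower)
}

stretch(1:10)
stretch(1:10, lower = 2, upper = 5)
stretch(c(1:5, NA), na.rm = TRUE)
```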
This function takes as an argument a list. If any of its elements are also lists, it unlists them. The output is the same as the input, except that there will be one new element per element in each sublist, and the sublists will be removed.
unlistRecursive(x)
x |
A list. |
A list.
x <- list(
  a = 1:3,
  b = list(
    b1 = c("The", "quick", "brown", "function"),
    b2 = 4:1,
    b3 = list(
      b3_1 = 5:7
    )
  ),
  c = "end"
)
unlistRecursive(x)
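The flattening described above can be sketched with a short recursive function. `flattenList` is a hypothetical name, and the element names it produces (each leaf keeps its own name) are an assumption that may not match the names returned by `unlistRecursive()`.

```r
# Hypothetical helper: replace every sublist by its elements, recursively,
# so the result contains only non-list elements.
flattenList <- function(x) {
  out <- list()
  for (i in seq_along(x)) {
    if (is.list(x[[i]])) {
      out <- c(out, flattenList(x[[i]]))   # splice in the flattened sublist
    } else {
      out <- c(out, x[i])                  # x[i] keeps the element's name
    }
  }
  out
}

nested <- list(a = 1:3, b = list(b1 = "x", b2 = 4:1, b3 = list(b3_1 = 5:7)), c = "end")
str(flattenList(nested))
```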
This function turns a "ragged" matrix into a vector. Consider a case where you have a matrix that looks like:
1, 0, 1
2, 3, NA
NA, 4, NA
Here, each row represents a series of values, where missing values are represented by NA. This can be turned into a vector by reading the matrix from left to right and top to bottom, as in c(1, 0, 1, 2, 3, 4), plus a vector c(1, 4, 6), which gives the position in that vector of the first non-NA value from each row of the matrix, plus another vector, c(1, 1, 1, 2, 2, 3), indicating the row to which each value in the vector belonged.
unragMatrix(x, skip = NA)
x |
A matrix. |
skip |
|
A list with one vector per matrix, plus 1) a vector named startIndex with indices of start values, and 2) a vector named row with one value per non-skip value in each matrix.
# default
x <- matrix(c(1, 0, 1, 2, 3, NA, NA, 4, NA), byrow = TRUE, nrow = 3)
unragMatrix(x)

# skip nothing
unragMatrix(x, skip = NULL)

# skips rows with all "skip" values
y <- matrix(c(1, 0, 1, NA, NA, NA, NA, 4, NA), byrow = TRUE, nrow = 3)
unragMatrix(y)
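The three vectors from the description can be reproduced step by step in base R. This is a sketch of the idea for the default case of skipping NA; the variable names are chosen for illustration, and the structure of the list returned by `unragMatrix()` is not assumed.

```r
m <- matrix(c(1, 0, 1, 2, 3, NA, NA, 4, NA), byrow = TRUE, nrow = 3)

vals  <- as.vector(t(m))                         # read left to right, top to bottom
rowId <- rep(seq_len(nrow(m)), each = ncol(m))   # row of each value read
keep  <- !is.na(vals)

values     <- vals[keep]                  # 1 0 1 2 3 4
row        <- rowId[keep]                 # 1 1 1 2 2 3
startIndex <- match(unique(row), row)     # 1 4 6: first value from each row
```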
These functions are vectorized versions of which.max and which.min, which return the index of the value that is maximum or minimum (or the first maximum/minimum value, if there is a tie). In this case, the function is supplied two or more vectors of the same length. For each element at the same position (e.g., the first element in each vector, then the second element, etc.) the function returns an integer indicating which vector has the highest or lowest value (or the index of the first vector with the highest or lowest value in case of ties).
which.pmax(..., na.rm = TRUE)
which.pmin(..., na.rm = TRUE)
... |
Two or more vectors. If lengths do not match, the results will likely be unanticipated. |
na.rm |
Logical, if |
Vector the same length as the input, with numeric values indicating which vector has the highest value at that position. In case of ties, the index of the first vector is returned.
which.pmin(): Which vector has minimum value at each element
which.max, which.min, pmax, pmin
set.seed(123)
a <- sample(9, 5)
b <- sample(9, 5)
c <- sample(9, 5)
a[2:3] <- NA
b[3] <- NA
a[6] <- NA
b[6] <- NA
c[6] <- NA
which.pmax(a, b, c)
which.pmin(a, b, c)
which.pmax(a, b, c, na.rm=FALSE)
which.pmin(a, b, c, na.rm=FALSE)
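The position-wise comparison can be sketched by binding the vectors into a matrix and applying `which.max()` to each row. `whichRowMax` is a hypothetical helper; it ignores NAs within a position and returns NA only when every vector is NA there, which is an assumption about the `na.rm = TRUE` case rather than a statement of the package's behavior.

```r
# Hypothetical helper: for each position, the index of the vector holding
# the largest value (first one wins on ties).
whichRowMax <- function(...) {
  m <- cbind(...)
  vapply(seq_len(nrow(m)), function(i) {
    w <- which.max(m[i, ])                  # which.max() skips NA values
    if (length(w) == 0L) NA_integer_ else as.integer(w)
  }, integer(1))
}

v1 <- c(3, NA, 1, NA)
v2 <- c(2, 5, 1, NA)
v3 <- c(9, 4, 0, NA)
whichRowMax(v1, v2, v3)   # 3 2 1 NA
```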
This function attempts to return the year from character strings representing dates. The formats can be ambiguous and can vary within the same set. For example, it returns 1982 (or 9982 if the century is ambiguous) from "11/20/82", "1982-11-20", "Nov. 20, 1982", "20 Nov 1982", "20-Nov-1982", "20/Nov/1982", "20 Nov. 82", "20 Nov 82". The function handles ambiguous centuries (e.g., 1813, 1913, 2013) by using a dummy placeholder for the century (e.g., 9913). Note that it may return warnings like "NAs introduced by coercion".
yearFromDate(x, yearLast = TRUE)
x |
Character or character vector, one or more dates. |
yearLast |
Logical, if |
Numeric.
yearFromDate(1969, yearLast=TRUE)
yearFromDate('10-Jul-71', yearLast=TRUE) # --> 9971
yearFromDate('10-Jul-1971', yearLast=TRUE) # --> 1971
yearFromDate('10-19-71', yearLast=TRUE) # --> 9971
yearFromDate('10-19-1969', yearLast=TRUE) # --> 1969
yearFromDate('10-1-71', yearLast=TRUE) # --> 9971
yearFromDate('3-22-71', yearLast=TRUE) # --> 9971
yearFromDate('3-2-71', yearLast=TRUE) # --> 9971
yearFromDate('10-1-1969', yearLast=TRUE) # --> 1969
yearFromDate('3-22-1969', yearLast=TRUE) # --> 1969
yearFromDate('3-2-1969', yearLast=TRUE) # --> 1969
yearFromDate('10/Jul/71', yearLast=TRUE) # --> 9971
yearFromDate('10/Jul/1971', yearLast=TRUE) # --> 1971
yearFromDate('10/19/71', yearLast=TRUE) # --> 9971
yearFromDate('10/19/1969', yearLast=TRUE) # --> 1969
yearFromDate('10/1/71', yearLast=TRUE) # --> 9971
yearFromDate('3/22/71', yearLast=TRUE) # --> 9971
yearFromDate('3/2/71', yearLast=TRUE) # --> 9971
yearFromDate('10/1/1969', yearLast=TRUE) # --> 1969
yearFromDate('3/22/1969', yearLast=TRUE) # --> 1969
yearFromDate('3/2/1969', yearLast=TRUE) # --> 1969
yearFromDate('10 mmm 71', yearLast=TRUE) # "mmm" is month abbreviation --> 9971
yearFromDate('5 mmm 71', yearLast=TRUE) # "mmm" is month abbreviation --> 9971
yearFromDate('10 19 71', yearLast=TRUE) # --> 9971
yearFromDate('10 19 1969', yearLast=TRUE) # --> 1969
yearFromDate('10 1 71', yearLast=TRUE) # --> 9971
yearFromDate('3 22 71', yearLast=TRUE) # --> 9971
yearFromDate('3 2 71', yearLast=TRUE) # --> 9971
yearFromDate('10 1 1969', yearLast=TRUE) # --> 1969
yearFromDate('3 22 1969', yearLast=TRUE) # --> 1969
yearFromDate('3 2 1969', yearLast=TRUE) # --> 1969
yearFromDate('Oct. 19, 1969', yearLast=TRUE) # --> 1969
yearFromDate('19 October 1969', yearLast=TRUE) # --> 1969
yearFromDate('How you do dat?', yearLast=TRUE) # --> NA
yearFromDate('2014-07-03', yearLast=TRUE) # --> 2014
yearFromDate('2014-7-03', yearLast=TRUE) # --> 2014
yearFromDate('2014-07-3', yearLast=TRUE) # --> 2014
yearFromDate('2014-7-3', yearLast=TRUE) # --> 2014
yearFromDate('2014/07/03', yearLast=TRUE) # --> 2014
yearFromDate('2014/7/03', yearLast=TRUE) # --> 2014
yearFromDate('2014/07/3', yearLast=TRUE) # --> 2014
yearFromDate('2014/7/3', yearLast=TRUE) # --> 2014
yearFromDate('2014 07 03', yearLast=TRUE) # --> 2014
yearFromDate('2014 7 03', yearLast=TRUE) # --> 2014
yearFromDate('2014 07 3', yearLast=TRUE) # --> 2014
yearFromDate('2014 7 3', yearLast=TRUE) # --> 2014

yearFromDate(1969, yearLast=FALSE)
yearFromDate('10-Jul-71', yearLast=FALSE) # --> 9971
yearFromDate('10-Jul-1971', yearLast=FALSE) # --> 1971
yearFromDate('10-19-71', yearLast=FALSE) # --> 9910
yearFromDate('10-19-1969', yearLast=FALSE) # --> 1969
yearFromDate('10-1-71', yearLast=FALSE) # --> 9910
yearFromDate('3-22-71', yearLast=FALSE) # --> 9971
yearFromDate('3-2-71', yearLast=FALSE) # --> 9971
yearFromDate('10-1-1969', yearLast=FALSE) # --> 1969
yearFromDate('3-22-1969', yearLast=FALSE) # --> 1969
yearFromDate('3-2-1969', yearLast=FALSE) # --> 1969
yearFromDate('10/19/71', yearLast=FALSE) # --> 9910
yearFromDate('10/19/1969', yearLast=FALSE) # --> 1969
yearFromDate('10/1/71', yearLast=FALSE) # --> 9910
yearFromDate('3/22/71', yearLast=FALSE) # --> 9971
yearFromDate('3/2/71', yearLast=FALSE) # --> 9971
yearFromDate('10/1/1969', yearLast=FALSE) # --> 1969
yearFromDate('3/22/1969', yearLast=FALSE) # --> 1969
yearFromDate('3/2/1969', yearLast=FALSE) # --> 1969
yearFromDate('10 mmm 71', yearLast=FALSE) # "mmm" is month abbreviation --> 9971
yearFromDate('5 mmm 71', yearLast=FALSE) # "mmm" is month abbreviation --> 9971
yearFromDate('10 19 71', yearLast=FALSE) # --> 9910
yearFromDate('10 19 1969', yearLast=FALSE) # --> 1969
yearFromDate('10 1 71', yearLast=FALSE) # --> 9910
yearFromDate('3 22 71', yearLast=FALSE) # --> 9971
yearFromDate('3 2 71', yearLast=FALSE) # --> 9971
yearFromDate('10 1 1969', yearLast=FALSE) # --> 1969
yearFromDate('3 22 1969', yearLast=FALSE) # --> 1969
yearFromDate('3 2 1969', yearLast=FALSE) # --> 1969
yearFromDate('Oct. 19, 1969', yearLast=FALSE) # --> 1969
yearFromDate('19 October 1969', yearLast=FALSE) # --> 1969
yearFromDate('How you do dat?', yearLast=FALSE) # --> NA
yearFromDate('2014-07-03', yearLast=FALSE) # --> 2014
yearFromDate('2014-7-03', yearLast=FALSE) # --> 2014
yearFromDate('2014-07-3', yearLast=FALSE) # --> 2014
yearFromDate('2014-7-3', yearLast=FALSE) # --> 2014
yearFromDate('2014/07/03', yearLast=FALSE) # --> 2014
yearFromDate('2014/7/03', yearLast=FALSE) # --> 2014
yearFromDate('2014/07/3', yearLast=FALSE) # --> 2014
yearFromDate('2014/7/3', yearLast=FALSE) # --> 2014
yearFromDate('2014 07 03', yearLast=FALSE) # --> 2014
yearFromDate('2014 7 03', yearLast=FALSE) # --> 2014
yearFromDate('2014 07 3', yearLast=FALSE) # --> 2014
yearFromDate('2014 7 3', yearLast=FALSE) # --> 2014
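The general idea of pulling a year out of a messy date string can be sketched with regular expressions: take a four-digit run if one exists, otherwise treat a trailing two-digit field as the year and mark the unknown century with 99. `guessYear` is a hypothetical, much simpler helper than `yearFromDate()`; it always assumes two-digit years come last and does not reproduce the `yearLast = FALSE` behavior.

```r
# Hypothetical helper: crude year extraction from date-like strings
guessYear <- function(x) {
  vapply(as.character(x), function(s) {
    m <- regexpr("[0-9]{4}", s)                    # four-digit year anywhere
    if (m != -1) return(as.numeric(regmatches(s, m)))
    m <- regexpr("[0-9]{2}$", s)                   # two-digit year at the end
    if (m != -1) return(9900 + as.numeric(regmatches(s, m)))
    NA_real_                                       # nothing year-like found
  }, numeric(1), USE.NAMES = FALSE)
}

guessYear(c('10-Jul-71', '2014-07-03', 'Oct. 19, 1969', 'How you do dat?'))
# 9971 2014 1969 NA
```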