I'm having a problem when creating matrices in R. The final number of rows does not match the number I specify. I compute the number of rows as the sum of a simple numeric vector, but the resulting matrix always has one row fewer. See the example.
# Basic objects to reproduce the issue
m <- 50
p <- 1:2
subset.min <- 2
subset.max <- 2
# The numeric vector
bin <- factorial(m)/(factorial(p) * factorial(m - p))
bin
# Desired number of lines
nRow <- sum(bin[subset.min:subset.max])
nRow
# The matrix should have 1225 lines, but it only has 1224 lines
m1 <- matrix(NA, nrow = nRow, ncol = 1)
dim(m1)
# This way it works
m2 <- matrix(NA, nrow = 1225, ncol = 1)
dim(m2)
Any idea what might be going on?
R version 4.4.2 - macOS Monterey (12.7.6)
asked 22 hours ago by Vanderlei Debastiani
1 Answer
You can see that nRow is actually numeric, instead of integer, and most importantly, its printed value differs from its integer conversion, e.g.,
> str(nRow)
num 1225
> as.integer(nRow)
[1] 1224
or as suggested by SamR (see the comments)
> sprintf("%.20f", nRow)
[1] "1224.99999999998544808477"
When you create a matrix with nRow rows, matrix takes an integer value to allocate the matrix size, so the numeric value is truncated, and thus you have
> dim(m1)
[1] 1224 1
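The truncation can be reproduced in isolation. Below is a minimal sketch, assuming a made-up value x that sits just below a whole number (x and mx are illustrative names, not part of the question):

```r
# x is a hypothetical value, slightly below 5 because of a tiny floating-point error
x <- 5 - 1e-12

# At the default 7 significant digits, x prints as 5
print(x)

# More digits reveal that x is not exactly 5
sprintf("%.15f", x)

# Conversion to integer truncates toward zero rather than rounding
as.integer(x)   # 4

# matrix() converts nrow the same way, so one row goes missing
mx <- matrix(NA, nrow = x, ncol = 1)
dim(mx)         # 4 1
```

The printed value hides the error, which is why nRow looks like 1225 in the console.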
The precision loss comes from the use of factorial. You can use choose instead to avoid the problem:
> # The numeric vector
> bin <- choose(m, p)
> bin
[1] 50 1225
> nRow <- sum(bin[subset.min:subset.max])
> nRow
[1] 1225
> m1 <- matrix(1, nrow = nRow, 1)
> dim(m1)
[1] 1225 1
where the precision loss can be observed via
> as.integer(choose(m, p)) - factorial(m) / (factorial(p) * factorial(m - p))
[1] 5.897505e-13 1.455192e-11
Anyway, the best thing to do is to make sure that integers are passed to matrix to define the size, and not to place full faith in numeric values on occasions where integers are needed.
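One possible way to follow this advice, if the factorial-based computation has to stay, is to round the numeric result before converting it. This is only a sketch: nRowInt is a name introduced here for illustration, and the all.equal check is an assumed safeguard, not something from the question.

```r
# Objects from the question
m <- 50
p <- 1:2
subset.min <- 2
subset.max <- 2

# factorial-based computation, which may carry a small floating-point error
bin <- factorial(m) / (factorial(p) * factorial(m - p))
nRow <- sum(bin[subset.min:subset.max])

# Assumed safeguard: stop if nRow is not a whole number within numeric tolerance
stopifnot(isTRUE(all.equal(nRow, round(nRow))))

# Round to the nearest whole number first, then convert to integer
nRowInt <- as.integer(round(nRow))

m1 <- matrix(NA, nrow = nRowInt, ncol = 1)
dim(m1)   # 1225 1
```

Rounding before the conversion means a value such as 1224.99999999998 becomes 1225 instead of being truncated to 1224.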