I have an RGB image and want to identify 'white pixels' or those which have a value of '255' in each channel. Next I'd like to replace all of those values with '0' to convert white pixels into black pixels:
img = array(c(100,255), dim = c(1800,1800, 3)) #dummy image (edited on request: original image was completely white)
rgbSum = rowSums(img, dims = 2) #sum of RGB channels
idx = which(rgbSum == 765, arr.ind = TRUE) #765 equals 255 in each channel equals white pixel
img[idx[,1],idx[,2],1:3] = 0 #this step takes super long
The last line of this code takes very long (> 2 hours). So long that eventually I gave up or my system crashed. Is there a faster way to replace the specific values?
I have an RGB image and want to identify 'white pixels' or those which have a value of '255' in each channel. Next I'd like to replace all of those values with '0' to convert white pixels into black pixels:
img = array(c(100,255), dim = c(1800,1800, 3)) #dummy image (edited on request: original image was completely white)
rgbSum = rowSums(img, dims = 2) #sum of RGB channels
idx = which(rgbSum == 765, arr.ind = TRUE) #765 equals 255 in each channel equals white pixel
img[idx[,1],idx[,2],1:3] = 0 #this step takes super long
The last line of this code takes very long (> 2 hours). So long that eventually I gave up or my system crashed. Is there a faster way to replace the specific values?
Share edited Feb 28 at 14:41 mri asked Feb 20 at 12:33 mrimri 6002 silver badges14 bronze badges 5 |3 Answers
Reset to default 7This approach works a little faster
img = array(255, dim = c(1800,1800, 3))
img[img[,,1] == 255 & img[,,2] == 255 & img[,,3] == 255] <- 0
grid::grid.raster(img / 255)
In
Out
If you really want to speed things up
Use a RCPP function
set_zero.cpp
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
NumericVector replaceWhitePixels(NumericVector img) {
IntegerVector dims = img.attr("dim");
int height = dims[0];
int width = dims[1];
int channels = dims[2];
if (channels != 3) {
stop("Image must have exactly 3 color channels.");
}
NumericVector img_copy = clone(img);
// Iterate over pixels
for (int j = 0; j < width; ++j) {
for (int i = 0; i < height; ++i) {
int index = i + height * j;
// Check if the pixel is white (255,255,255)
if (img_copy[index] == 255.0 &&
img_copy[index + height * width] == 255.0 &&
img_copy[index + 2 * height * width] == 255.0) {
// Set to black (0,0,0)
img_copy[index] = 0.0;
img_copy[index + height * width] = 0.0;
img_copy[index + 2 * height * width] = 0.0;
}
}
}
img_copy.attr("dim") = dims; // Restore the 3D shape
return img_copy;
}
Use set_zero.cpp and benchmark
library(Rcpp)
sourceCpp("set_zero.cpp") # source cpp function
img <- replaceWhitePixels(img) # use rcpp function
microbenchmark::microbenchmark(
x1 = {img[img[,,1] == 255 & img[,,2] == 255 & img[,,3] == 255] <- 0},
x2 = {img[(img[,,1] + img[,,2] + img[,,3]) == 765] <- 0},
x3 = {img[rowSums(img == 255, dims = 2) == 3] <- 0},
x4 = {img[Reduce(`&`, lapply(1:3, function(i) img[,,i] == 255))] <- 0},
x5 = {replaceWhitePixels(img)},
times = 50
)
Unit: milliseconds
expr min lq mean median uq max neval cld
x1 84.1338 84.7783 93.01965 91.64740 99.8094 109.7169 50 a
x2 63.7948 64.4846 70.78049 68.52470 74.2659 100.0960 50 b
x3 82.7356 83.2619 88.98157 89.10510 90.0160 107.6304 50 c
x4 84.3123 85.9636 93.37466 91.43025 99.0053 110.1644 50 a
x5 11.3669 11.7678 14.87864 12.11920 15.4708 30.3509 50 d
MemAlloc does also seem to be finy with this approach
bench::mark(
+ x5 = {replaceWhitePixels(img)}, # Store output
+ iterations = 50
+ )
# A tibble: 1 × 13
expression min median `itr/sec` mem_alloc `gc/sec` n_itr n_gc total_time result memory time gc
<bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl> <int> <dbl> <bch:tm> <list> <list> <list> <list>
1 x5 11.4ms 15.1ms 69.5 74.2MB 11.3 43 7 619ms <dbl [1,800 × 1,800 × 3]> <Rprofmem [1 × 3]> <bench_tm> <tibble>
As you can see, x5 is much faster than anything else...
On request, we can benchmark x5 on a grey-striped image
img = array(c(100,255), dim = c(1800,1800, 3)) # partly grey and white
grid::grid.raster(img / 255)
library(Rcpp)
sourceCpp("set_zero.cpp")
img2 <- replaceWhitePixels(img)
grid::grid.raster(img2 / 255)
Summing directly seems to be about 20% faster and use almost half the memory than the previous approach (which was great in speeding things up from the original!), likely because it only has to evaluate one sum equality, not three equalities:
img[(img[,,1] + img[,,2] + img[,,3]) == 765] <- 0
Speed/memory tests:
microbenchmark::microbenchmark(
x1 = img[img[,,1] == 255 & img[,,2] == 255 & img[,,3] == 255] <- 0,
x2 = img[(img[,,1] + img[,,2] + img[,,3]) == 765] <- 0
)
# Unit: milliseconds
# expr min lq mean median uq max neval cld
# x1 86.78284 106.9815 119.2120 111.82070 119.37555 373.0943 100 a
# x2 69.89446 78.1248 104.9138 91.63681 99.91959 376.2333 100 a
bench::mark(
x1 = img[img[,,1] == 255 & img[,,2] == 255 & img[,,3] == 255] <- 0,
x2 = img[(img[,,1] + img[,,2] + img[,,3]) == 765] <- 0
)
# expression min median `itr/sec` mem_alloc `gc/sec` n_itr n_gc total_time result memory time
# <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl> <int> <dbl> <bch:tm> <list> <list> <list>
# 1 x1 82.4ms 82.4ms 12.1 136MB 48.5 1 4 82.4ms <dbl> <Rprofmem> <bench_tm>
# 2 x2 76.1ms 77.5ms 12.6 86.6MB 12.6 3 3 238.2ms <dbl> <Rprofmem> <bench_tm>
In your specific case, probably you can simply run
img * array(rep(rowSums(img, dims = 2) != 765, 3), dim(img))
Benchmark
With @Tim G's and @jpsmith's solutions
img0 <- array(255, dim = c(1800, 1800, 3))
microbenchmark(
f_IIC = {
img * array(rep(rowSums(img, dims = 2) != 765, 3), dim(img))
},
f_jpsmith = {
img[(img[, , 1] + img[, , 2] + img[, , 3]) == 765] <- 0
img
},
f_TimG = {
img[img[, , 1] == 255 & img[, , 2] == 255 & img[, , 3] == 255] <- 0
img
},
check = "equal",
times = 50,
unit = "relative",
setup = {
img <- img0
}
)
gives
Unit: relative
expr min lq mean median uq max neval
f_IIC 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 50
f_jpsmith 1.144069 1.164050 1.190964 1.178859 1.194696 1.149946 50
f_TimG 1.370778 1.412908 1.405420 1.400339 1.365923 1.248986 50
255L
instead of255
will half your memory usage because it will use a 32-bit integer instead of a 64-bit real, and probably make it a little faster too. How do you use your image? You can actually store all three RGB values in a single integer asR * 2^16 + G * 2^8 + B
. The code then becomeswhich(img == 16777215L, 0L)
which uses another factor3
less memory and gives a4x
speedup compared to the accepted pure R solution on my system. The downside is that functions may not accept it in this format. – asdfldsfdfjjfddjf Commented Feb 23 at 10:11