r - Looking for a more efficient way to replace matrix elements - Stack Overflow

I have an RGB image and want to identify 'white pixels' or those which have a value of '255' in each channel. Next I'd like to replace all of those values with '0' to convert white pixels into black pixels:

img = array(c(100,255), dim = c(1800,1800, 3)) #dummy image (edited on request: original image was completely white)
rgbSum = rowSums(img, dims = 2) #sum of RGB channels
idx = which(rgbSum == 765, arr.ind = TRUE) #765 equals 255 in each channel equals white pixel
img[idx[,1],idx[,2],1:3] = 0 #this step takes super long

The last line of this code takes very long (> 2 hours). So long that eventually I gave up or my system crashed. Is there a faster way to replace the specific values?
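A likely reason the last line is so slow: when `[` receives separate row and column vectors, it selects every row/column combination rather than the matched pairs. In the dummy image every second row is white, so `idx` has 1,620,000 rows and the assignment would touch roughly 7.9e12 elements. It would also blacken pixels that were never white. The sketch below shows the effect on a small stand-in array and one way to address exact cells with an index matrix; the names `cross` and `fixed` are only illustrative.

```r
# Small 4 x 4 x 3 stand-in for the image; two pixels are white.
img <- array(100, dim = c(4, 4, 3))
img[1, 2, ] <- 255
img[3, 4, ] <- 255

idx <- which(rowSums(img, dims = 2) == 765, arr.ind = TRUE)

# Vector subscripts select every row/column combination:
# rows {1, 3} x cols {2, 4} = 4 pixels, not just the 2 white ones.
cross <- img
cross[idx[, 1], idx[, 2], 1:3] <- 0
n_cross <- sum(cross[, , 1] == 0)

# A three-column index matrix addresses exact (row, col, channel) cells.
fixed <- img
for (k in 1:3) fixed[cbind(idx, k)] <- 0
n_fixed <- sum(fixed[, , 1] == 0)
```

Here `n_cross` counts the pixels zeroed by the original style of indexing and `n_fixed` those zeroed by matrix indexing.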

asked Feb 20 at 12:33 by mri, edited Feb 28 at 14:41
  • Using 255L instead of 255 will halve your memory usage because it will use a 32-bit integer instead of a 64-bit real, and probably make it a little faster too. How do you use your image? You can actually store all three RGB values in a single integer as R * 2^16 + G * 2^8 + B. The code then becomes which(img == 16777215L, 0L) which uses another factor 3 less memory and gives a 4x speedup compared to the accepted pure R solution on my system. The downside is that functions may not accept it in this format. – asdfldsfdfjjfddjf Commented Feb 23 at 10:11
  • How long is 'very long' and how fast is acceptably 'faster'? Questions on Stack Overflow need to be specific and objectively answerable. There's no way anyone can do anything other than guess without metrics to aim for. – TylerH Commented Feb 25 at 17:45
  • @TylerH honestly, I don't know how long my initial approach takes because at some point I simply gave up or my program crashed. I didn't have any specific requirement for what is 'acceptably faster'. I don't need an implementation that works instantaneously, I simply wanted a somewhat faster solution so that I won't die from caffeine overdose. – mri Commented Feb 28 at 11:40
  • @mri your original question was great, reproducible, and objectively answerable (1 second is objectively faster than two seconds). Your edit (“reasonably fast”) is much more subjective. I would suggest you roll back to the original post, but if you feel compelled to edit, you may want to put some sort of benchmark in there. The Rcpp approach in the accepted answer is probably going to be the one that requires the least coffee regardless though! – jpsmith Commented Feb 28 at 12:04
  • @jpsmith thanks for your feedback. I've rolled back to the previous version. I tried to benchmark my original code but after 2 hours the last line was still not done computing. – mri Commented Feb 28 at 14:44
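The packing idea from the first comment (one integer per pixel, R * 2^16 + G * 2^8 + B) could look roughly like the sketch below. `pack_rgb` and `unpack_rgb` are hypothetical helper names, not functions from the thread or from any package, and the sketch assumes channel values are whole numbers in 0..255.

```r
# Hypothetical helpers, not taken from the thread.
pack_rgb <- function(img) {
  storage.mode(img) <- "integer"   # keeps dim, switches to 32-bit integers
  img[, , 1] * 65536L + img[, , 2] * 256L + img[, , 3]
}

unpack_rgb <- function(p) {
  # Integer division and modulo recover the three 8-bit channels.
  array(c(p %/% 65536L, (p %/% 256L) %% 256L, p %% 256L),
        dim = c(dim(p), 3L))
}

img <- array(c(100, 255), dim = c(4, 4, 3))   # rows alternate grey / white
packed <- pack_rgb(img)
packed[packed == 16777215L] <- 0L   # 16777215 = 255 * 2^16 + 255 * 2^8 + 255
out <- unpack_rgb(packed)
```

Whether the packed form pays off depends on how the image is used afterwards, since most plotting functions expect the three-channel array.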

3 Answers

This approach runs in well under a second (see the benchmark below):

img = array(255, dim = c(1800,1800, 3)) 

img[img[,,1] == 255 & img[,,2] == 255 & img[,,3] == 255] <- 0

grid::grid.raster(img / 255)
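Why a two-dimensional mask is enough for a three-channel array: used as a single subscript, a logical mask shorter than the array is recycled, so the same pixel positions are selected in each channel. A minimal sketch on a small stand-in image (the names `white`, `out` and `out2` are illustrative):

```r
# Small 2 x 2 x 3 stand-in: pixel (1, 1) is white, the rest is grey.
img <- array(100, dim = c(2, 2, 3))
img[1, 1, ] <- 255

# 2 x 2 logical mask, TRUE only where all three channels equal 255.
white <- img[, , 1] == 255 & img[, , 2] == 255 & img[, , 3] == 255

# Used as a single subscript, the 4-element mask is recycled over the
# 12 elements of img, so the same pixel is hit in every channel.
out <- img
out[white] <- 0

# Equivalent form with the recycling written out.
out2 <- img
out2[rep(white, 3)] <- 0
```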

If you really want to speed things up, use an Rcpp function.

set_zero.cpp:

#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
NumericVector replaceWhitePixels(NumericVector img) {
  IntegerVector dims = img.attr("dim"); 
  int height = dims[0];
  int width = dims[1];
  int channels = dims[2];
  
  if (channels != 3) {
    stop("Image must have exactly 3 color channels.");
  }
  NumericVector img_copy = clone(img);  
  
  // Iterate over pixels
  for (int j = 0; j < width; ++j) {
    for (int i = 0; i < height; ++i) {
      int index = i + height * j;
      
      // Check if the pixel is white (255,255,255)
      if (img_copy[index] == 255.0 &&
          img_copy[index + height * width] == 255.0 &&
          img_copy[index + 2 * height * width] == 255.0) {
        
        // Set to black (0,0,0)
        img_copy[index] = 0.0;
        img_copy[index + height * width] = 0.0;
        img_copy[index + 2 * height * width] = 0.0;
      }
    }
  }
  
  img_copy.attr("dim") = dims;  // Restore the 3D shape
  return img_copy;
}
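The index arithmetic in the loop relies on R storing arrays in column-major order: rows vary fastest, then columns, then channels. The same mapping can be checked from R, with the 1-based equivalent of `i + height * j` plus one plane of `height * width` elements per channel. The helper `lin` is only illustrative.

```r
h <- 4; w <- 5
img <- array(seq_len(h * w * 3), dim = c(h, w, 3))

# 1-based equivalent of the C++ expression `i + height * j`,
# plus one channel plane (h * w elements) per channel step.
lin <- function(i, j, k) i + (j - 1) * h + (k - 1) * h * w

i <- 3; j <- 2
same <- c(
  img[lin(i, j, 1)] == img[i, j, 1],
  img[lin(i, j, 2)] == img[i, j, 2],
  img[lin(i, j, 3)] == img[i, j, 3]
)
```

All three entries of `same` are `TRUE`, which is why the C++ code can reach the green and blue values of a pixel by adding `height * width` and `2 * height * width` to the red index.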

Source set_zero.cpp and benchmark:

library(Rcpp)
sourceCpp("set_zero.cpp") # source cpp function    
img <- replaceWhitePixels(img) # use rcpp function

microbenchmark::microbenchmark(
  x1 = {img[img[,,1] == 255 & img[,,2] == 255 & img[,,3] == 255] <- 0},
  x2 = {img[(img[,,1] + img[,,2] + img[,,3]) == 765] <- 0},
  x3 = {img[rowSums(img == 255, dims = 2) == 3] <- 0},
  x4 = {img[Reduce(`&`, lapply(1:3, function(i) img[,,i] == 255))] <- 0},
  x5 = {replaceWhitePixels(img)},
  times = 50
)

Unit: milliseconds
 expr     min      lq     mean   median      uq      max neval  cld
   x1 84.1338 84.7783 93.01965 91.64740 99.8094 109.7169    50 a   
   x2 63.7948 64.4846 70.78049 68.52470 74.2659 100.0960    50  b  
   x3 82.7356 83.2619 88.98157 89.10510 90.0160 107.6304    50   c 
   x4 84.3123 85.9636 93.37466 91.43025 99.0053 110.1644    50 a   
   x5 11.3669 11.7678 14.87864 12.11920 15.4708  30.3509    50    d

Memory allocation also seems to be fine with this approach:

bench::mark(
  x5 = {replaceWhitePixels(img)},
  iterations = 50
)
# A tibble: 1 × 13
  expression      min   median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc total_time result                    memory             time       gc      
  <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl> <int> <dbl>   <bch:tm> <list>                    <list>             <list>     <list>  
1 x5           11.4ms   15.1ms      69.5    74.2MB     11.3    43     7      619ms <dbl [1,800 × 1,800 × 3]> <Rprofmem [1 × 3]> <bench_tm> <tibble>

As you can see, x5 (the Rcpp function) is by far the fastest: its median is about 12 ms, against roughly 69 to 92 ms for the pure R variants.


On request, here is x5 applied to a grey-and-white striped image:

img = array(c(100,255), dim = c(1800,1800, 3)) # partly grey and white
grid::grid.raster(img / 255) 

library(Rcpp)
sourceCpp("set_zero.cpp")
img2 <- replaceWhitePixels(img)
grid::grid.raster(img2 / 255)

Summing the channels directly seems to be about 20% faster than the previous approach and to allocate noticeably less memory (86.6MB versus 136MB in the bench::mark run below). The previous approach was already a great speed-up over the original. The saving likely comes from evaluating one equality on the sum rather than three equalities:

img[(img[,,1] + img[,,2] + img[,,3]) == 765] <- 0

Speed/memory tests:

microbenchmark::microbenchmark(
  x1 = img[img[,,1] == 255 & img[,,2] == 255 & img[,,3] == 255] <- 0,
  x2 = img[(img[,,1] + img[,,2] + img[,,3]) == 765] <- 0
)

# Unit: milliseconds
# expr      min       lq     mean    median        uq      max neval cld
#   x1 86.78284 106.9815 119.2120 111.82070 119.37555 373.0943   100   a
#   x2 69.89446  78.1248 104.9138  91.63681  99.91959 376.2333   100   a


bench::mark(
  x1 = img[img[,,1] == 255 & img[,,2] == 255 & img[,,3] == 255] <- 0,
  x2 = img[(img[,,1] + img[,,2] + img[,,3]) == 765] <- 0
)
#   expression      min   median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc total_time result memory     time      
#   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl> <int> <dbl>   <bch:tm> <list> <list>     <list>    
#   1 x1           82.4ms   82.4ms      12.1     136MB     48.5     1     4     82.4ms <dbl>  <Rprofmem> <bench_tm>
#   2 x2           76.1ms   77.5ms      12.6    86.6MB     12.6     3     3    238.2ms <dbl>  <Rprofmem> <bench_tm>

In your specific case, you can probably just run

img * array(rep(rowSums(img, dims = 2) != 765, 3), dim(img))
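The one-liner works because a logical array behaves as 0/1 under multiplication: white pixels are multiplied by 0 and everything else by 1. A minimal sketch comparing it with logical indexing on a small stand-in image follows; it assumes channel values lie in 0..255, so that a sum of 765 can only mean three times 255. The names `keep`, `by_mult` and `by_index` are illustrative.

```r
img <- array(100, dim = c(3, 3, 3))
img[2, 2, ] <- 255              # one white pixel
img[1, 3, ] <- c(255, 255, 0)   # yellow pixel, must stay untouched

# 3 x 3 logical matrix, FALSE exactly at the white pixels.
keep <- rowSums(img, dims = 2) != 765

# FALSE counts as 0 and TRUE as 1 in the multiplication.
by_mult <- img * array(rep(keep, 3), dim(img))

# Same result via a recycled logical subscript.
by_index <- img
by_index[!keep] <- 0
```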

Benchmark

With @Tim G's and @jpsmith's solutions

library(microbenchmark)

img0 <- array(255, dim = c(1800, 1800, 3))

microbenchmark(
    f_IIC = {
        img * array(rep(rowSums(img, dims = 2) != 765, 3), dim(img))
    },
    f_jpsmith = {
        img[(img[, , 1] + img[, , 2] + img[, , 3]) == 765] <- 0
        img
    },
    f_TimG = {
        img[img[, , 1] == 255 & img[, , 2] == 255 & img[, , 3] == 255] <- 0
        img
    },
    check = "equal",
    times = 50,
    unit = "relative",
    setup = {
        img <- img0
    }
)

gives

Unit: relative
      expr      min       lq     mean   median       uq      max neval
     f_IIC 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000    50
 f_jpsmith 1.144069 1.164050 1.190964 1.178859 1.194696 1.149946    50
    f_TimG 1.370778 1.412908 1.405420 1.400339 1.365923 1.248986    50