I want to create a new column with a running sum of every value greater than 0. I have a dataframe:
df=data.frame(year=c('2007-04-01','2007-04-02','2007-04-03','2007-04-04','2007-04-05','2007-04-06'),air.temp=c(1,2,-1,3,1,0))
and I want to create:
df=data.frame(year=c('2007-04-01','2007-04-02','2007-04-03','2007-04-04','2007-04-05','2007-04-06'),air.temp=c(1,2,-1,3,1,0),temp.sum=c(1,3,3,6,7,7))
So far I have tried:
df$temp.sum <- if_else(df$air.temp > 0, cumsum(df$air.temp), 0)
Which resulted in:
temp.sum=c(1,3,0,5,6,0)
How do I not count values at or below 0, without changing the running sum? My dataset is 100,000+ observations, so simple suggestions are helpful!
Asked Feb 14 at 3:39 by savertons
3 Answers
Use a parallel maximum to make negative values 0, then take the cumulative sum.
cumsum(pmax(df$air.temp, 0))
#[1] 1 3 3 6 7 7
Seems very quick on 1.2M values:
x <- rep(df$air.temp, 2e5)
system.time(cumsum(pmax(x, 0)))
## user system elapsed
## 0 0 0
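To attach the result as a new column and confirm it matches the output requested in the question, something like the following should work. This is a minimal sketch; the name expected is introduced here only for the check and is not part of the original post.

```r
# Rebuild the example data frame from the question
df <- data.frame(
  year = c("2007-04-01", "2007-04-02", "2007-04-03",
           "2007-04-04", "2007-04-05", "2007-04-06"),
  air.temp = c(1, 2, -1, 3, 1, 0)
)

# Clamp values below 0 to 0, then accumulate
df$temp.sum <- cumsum(pmax(df$air.temp, 0))

# 'expected' is an illustrative name: the desired column from the question
expected <- c(1, 3, 3, 6, 7, 7)
stopifnot(identical(df$temp.sum, expected))

print(df)
```

Because pmax() and cumsum() are both vectorized, no explicit loop is needed, which is why this scales comfortably to 100,000+ rows.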
Another solution would be:
cumsum(df$air.temp*(df$air.temp > 0))
#[1] 1 3 3 6 7 7
If you are using if_else/ifelse:
cumsum(if_else(df$air.temp > 0,df$air.temp, 0))
#[1] 1 3 3 6 7 7
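The difference from the attempt in the question is only where cumsum() sits. A small side-by-side sketch, using base ifelse() as a stand-in for dplyr's if_else so that it runs without extra packages (the names attempt and fixed are just for illustration):

```r
air.temp <- c(1, 2, -1, 3, 1, 0)

# Attempt from the question: cumsum() runs over ALL values (including -1),
# and only afterwards are the non-positive positions overwritten with 0.
attempt <- ifelse(air.temp > 0, cumsum(air.temp), 0)

# Fix: zero out the non-positive values first, then accumulate.
fixed <- cumsum(ifelse(air.temp > 0, air.temp, 0))

print(attempt)
print(fixed)
```

In the first form the -1 still lowers the running total (3 drops to 2 before the next 3 is added, giving 5), and the rows at or below 0 show 0 instead of carrying the sum forward.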
This answer uses a brute-force while loop. The result is stored in summation.
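a <- c(1, 2, -1, 3, 1, 0)
n <- length(a)
i <- 1
total <- 0               # running sum so far (avoids masking base::sum)
summation <- numeric(n)  # preallocate the result vector
while (i <= n) {
  if (a[i] > 0) {        # only positive values add to the running sum
    total <- total + a[i]
  }
  summation[i] <- total
  i <- i + 1
}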
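As a sanity check, the same loop logic can be wrapped in a function and compared against the vectorized cumsum(pmax(...)) approach on random data. This is only a sketch: the function name running_positive_sum is hypothetical, and it uses a for loop over seq_along() rather than while, which should behave the same here.

```r
# Hypothetical helper: loop-based running sum of the positive values
running_positive_sum <- function(x) {
  out <- numeric(length(x))  # preallocate the result
  total <- 0
  for (i in seq_along(x)) {
    if (x[i] > 0) total <- total + x[i]
    out[i] <- total
  }
  out
}

# Compare with the vectorized version on random integers in -5..5
set.seed(42)
x <- sample(-5:5, 1000, replace = TRUE)
stopifnot(isTRUE(all.equal(running_positive_sum(x), cumsum(pmax(x, 0)))))

result <- running_positive_sum(c(1, 2, -1, 3, 1, 0))
print(result)
```

Both give the same values; the vectorized form is shorter and will generally be much faster on 100,000+ observations.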