For loops are perhaps more intuitive than sapply because the result is the same as if you ran the loop body by hand once for each value of i, in order. Here is what I mean:
> x = numeric(10)
> y = numeric(10)
> z = for(i in 1:10) {
+ y[i] = i
+ x[i] = i*2
+ x[i]
+ }
> x
[1] 2 4 6 8 10 12 14 16 18 20
> y
[1] 1 2 3 4 5 6 7 8 9 10
> z
NULL
So the code within the for loop actually changes what is stored in x and y, but the loop itself does not return anything useful. A for loop always evaluates to NULL, no matter what the last line of its body is. Thus, z is NULL.
Let's use very similar code, except using sapply:
> x = numeric(10)
> y = numeric(10)
> z = sapply(1:10, function(i){
+ y[i] = i
+ x[i] = i*2
+ x[i]
+ })
> x
[1] 0 0 0 0 0 0 0 0 0 0
> y
[1] 0 0 0 0 0 0 0 0 0 0
> z
[1] 2 4 6 8 10 12 14 16 18 20
Wait, why are x and y still full of 0s? This occurs because assignments made inside the function passed to sapply do not affect the global environment. Each call to the function works on its own local copies, so y[i] = i changes a copy of y that is discarded when the call returns, and the global y stays a vector of 0s as it was initialized. The only thing that survives is the value each call returns (here x[i]), which sapply collects into z. The trouble with sapply is that, because of this, one iteration cannot depend on a different iteration, i.e., we cannot calculate x based on what x was in a previous iteration. This is in direct contrast to a for loop, where the changes happen in the environment the loop runs in (here, the global environment), so we can use a previous iteration to determine the current one, like in this example:
> x = numeric(10)
> y = numeric(10)
> z = for(i in 2:10) {
+ y[i] = i
+ x[i] = x[i-1]+y[i-1]
+ }
> x
[1] 0 0 2 5 9 14 20 27 35 44
> y
[1] 0 2 3 4 5 6 7 8 9 10
> z
NULL
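The scoping difference behind all of this can be seen in isolation with a smaller sketch. The names local_only and w are made up for illustration and do not appear in the examples above. The second half uses R's superassignment operator (<<-), which is not used elsewhere in this post. It is shown only to make the contrast visible, not as a recommended style.

```r
# Assignments with = (or <-) inside a function act on a local copy.
x = numeric(3)
local_only = sapply(1:3, function(i) {
  x[i] = i * 2   # changes a copy of x that lives only inside this call
  x[i]           # the returned value is what sapply collects
})
x            # the global x is untouched
local_only   # the collected return values

# Superassignment (<<-) looks for w in the enclosing environments instead,
# so the vector defined outside the function is the one that gets modified.
w = numeric(3)
invisible(sapply(1:3, function(i) {
  w[i] <<- i * 2
}))
w
```

After running this, x is still all 0s while local_only holds 2, 4, 6, and w holds 2, 4, 6 because the superassignment reached the vector defined outside the function. When one iteration really does depend on another, a plain for loop is usually the easier thing to read.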