I'm going through some Linux kernel documentation on per-CPU variables. I understand that giving each CPU its own copy of a variable helps avoid cache line invalidation between CPUs and makes things faster.
But in a multiprocessor system, a task can be scheduled on any processor, so cache and TLB invalidation (or cleaning) keeps happening anyway for plenty of other kernel data. So how does having a few per-CPU variables improve the performance of the system, unless those variables are used very frequently?
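
To make the pattern I'm asking about concrete, here is a minimal userspace sketch of how I picture a per-CPU counter. This is not kernel code: the `sketch_` names, the CPU count of 4 and the 64-byte cache line size are my own assumptions for illustration. As far as I understand, the kernel expresses the same idea with `DEFINE_PER_CPU` and `this_cpu_inc`.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NR_CPUS    4   /* hypothetical CPU count */
#define SKETCH_CACHE_LINE 64  /* assumed cache line size in bytes */

/* One slot per CPU, aligned so that no two slots share a cache line. */
struct sketch_counter {
    _Alignas(SKETCH_CACHE_LINE) unsigned long count;
};

static struct sketch_counter sketch_counters[SKETCH_NR_CPUS];

/* Hot path: a CPU only ever writes to its own slot, so the write
 * does not invalidate a line that another CPU is also writing. */
void sketch_counter_inc(size_t cpu)
{
    assert(cpu < SKETCH_NR_CPUS);
    sketch_counters[cpu].count++;
}

/* Slow path: a reader that wants the total walks every slot. */
unsigned long sketch_counter_sum(void)
{
    unsigned long total = 0;
    for (size_t cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
        total += sketch_counters[cpu].count;
    return total;
}
```

Here the increment touches only the local slot, and the cost of visiting the other CPUs' cache lines is paid only when someone asks for the sum. My question is whether this saving matters when the counter is not updated very often.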