
sorting - Why is the cutoff value for switching to insertion sort on small sub-arrays, when optimizing quicksort, system-dependent?


On p. 296 of Sedgewick et al.'s Algorithms, 4th edition, the authors write:

The optimum value of the cutoff M is system-dependent, but any value between 5 and 15 is likely to work well in most situations.

But I don't understand what it means for the cutoff value to be system-dependent. Isn't the performance of an algorithm measured by the number of operations it performs, independent of the speed of the computer's processor?

asked Jan 20 at 7:05 by Kt Student
  • I haven't read the book, so I can't say what the author is aiming at, but optimizing for small values by definition is not related to big O notation. Big O is about asymptotic complexity, i.e., behavior for large inputs. When optimizing for small values, it's all about the constants, which are machine-dependent. – Vincent van der Weele Commented Jan 20 at 7:27
  • @VincentvanderWeele Correct me if I'm wrong, but I think the notation is related to the performance measurement in this case, because when sorting, the whole array is divided into MANY small sub-arrays, which is exactly where the optimization kicks in? – Kt Student Commented Jan 20 at 7:34
  • Yeah, that's why I'm not sure what point the author is trying to make. The thing is, you can choose any constant M and say that the problem for subarrays shorter than M is O(1). This doesn't impact the analysis of the whole algorithm; it stays O(n log n). The definition of O(n log n) is that there exists a constant C such that the runtime is less than C * n log n. The choice of M does have an impact on the constant C. – Vincent van der Weele Commented Jan 20 at 8:04
  • As @VincentvanderWeele said, for small containers it's machine-dependent (and also data-dependent...), and you can't ignore the constants around the O(...) value. To give you a degenerate example: you shouldn't invoke quicksort to sort an array of TWO numbers that never grows... Because depending on your platform, you can sort it in only one CPU instruction (test and swap), so O(1) complexity. – Wisblade Commented Jan 20 at 9:43
  • Due to cache being faster than ram, the range on systems in the last 10 years is more like 16 to 64. – rcgldr Commented Jan 20 at 11:03

1 Answer


I have a 2nd Edition, from 1984. On page 112, under the heading "Small Subfiles" in a discussion of Quicksort:

The second improvement stems from the observation that a recursive program is guaranteed to call itself for many small subfiles, so it should be changed to use a better method when small subfiles are encountered. One obvious way to do this is to change the test at the beginning of the recursive routine from if r>l then to a call on insertion sort (modified to accept parameters defining the subfile to be sorted), that is if r-l <= M then insertion(l,r). Here, M is some parameter whose exact value depends upon the implementation. The value chosen for M need not be the best possible: the algorithm works about the same for M in the range from about 5 to about 25. The reduction in the running time is on the order of 20% for most applications.

There are a couple of points here.

You're right: the performance of an algorithm is measured based on the number of operations it performs, not on the speed of the processor.

As I explained in my answer to a similar question, Quicksort is a hugely complicated algorithm when compared to Insertion sort. There is considerable bookkeeping overhead involved. That is, there are certain fixed costs with Quicksort, regardless of how large or small the subarray you're sorting. As the array gets smaller, the percentage of time spent in overhead increases.

Insertion sort is a very simple sorting algorithm. There is very little bookkeeping overhead. Insertion sort can be faster than Quicksort for small arrays because Insertion sort actually performs fewer operations than Quicksort does when the arrays are very small. The variation (as the text says, from 5 to 25) depends on the exact algorithm implementations.
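To make this concrete, here is a minimal sketch of the hybrid approach in Java (my own illustration, not the book's code; the cutoff value, the Lomuto partition, and the class and method names are just assumptions for the example):

import java.util.Arrays;

public class HybridQuicksort {
    // Cutoff below which small subarrays are handed to insertion sort.
    // The "best" value is machine- and implementation-dependent;
    // the book says anything from about 5 to about 25 works about the same.
    private static final int CUTOFF = 10;

    public static void sort(int[] a) {
        sort(a, 0, a.length - 1);
    }

    private static void sort(int[] a, int lo, int hi) {
        // Small subarray: switch to insertion sort instead of recursing further.
        if (hi - lo < CUTOFF) {
            insertionSort(a, lo, hi);
            return;
        }
        int p = partition(a, lo, hi);
        sort(a, lo, p - 1);
        sort(a, p + 1, hi);
    }

    // Simple Lomuto partition using the last element as the pivot.
    private static int partition(int[] a, int lo, int hi) {
        int pivot = a[hi];
        int i = lo;
        for (int j = lo; j < hi; j++) {
            if (a[j] < pivot) {
                int t = a[i]; a[i] = a[j]; a[j] = t;
                i++;
            }
        }
        int t = a[i]; a[i] = a[hi]; a[hi] = t;
        return i;
    }

    // Insertion sort restricted to a[lo..hi], inclusive.
    private static void insertionSort(int[] a, int lo, int hi) {
        for (int i = lo + 1; i <= hi; i++) {
            int key = a[i];
            int j = i - 1;
            while (j >= lo && a[j] > key) {
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = key;
        }
    }

    public static void main(String[] args) {
        int[] a = {9, 4, 7, 1, 8, 2, 6, 3, 5, 0};
        sort(a);
        System.out.println(Arrays.toString(a)); // [0, 1, 2, ..., 9]
    }
}

The only change from a textbook quicksort is the size test at the top of the recursive routine; everything said in the comments above about big O still holds, because the cutoff only affects the constant factor.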

Sedgewick's book, as with many other algorithms texts, often blurs the line between the theoretical and the practical. I think it's good to keep the practical in mind, but either the author should make it clear when he's talking about actual performance versus theoretical performance, or the instructor should clarify that distinction in class.
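If you want to see the system dependence for yourself, the simplest thing is to time the sort with different cutoffs on your own machine. Here is a crude, hypothetical benchmark sketch (no JVM warm-up or statistical rigor, just enough to show the trend; the names, input size, and cutoff values are all assumptions):

import java.util.Random;

public class CutoffBenchmark {
    // Hybrid quicksort with the insertion-sort cutoff passed in as a parameter,
    // so several values of M can be timed on the same input.
    static void sort(int[] a, int lo, int hi, int cutoff) {
        if (hi - lo < cutoff) {
            for (int i = lo + 1; i <= hi; i++) {      // insertion sort on a[lo..hi]
                int key = a[i], j = i - 1;
                while (j >= lo && a[j] > key) { a[j + 1] = a[j]; j--; }
                a[j + 1] = key;
            }
            return;
        }
        int pivot = a[hi], i = lo;                    // Lomuto partition, last element as pivot
        for (int j = lo; j < hi; j++) {
            if (a[j] < pivot) { int t = a[i]; a[i] = a[j]; a[j] = t; i++; }
        }
        int t = a[i]; a[i] = a[hi]; a[hi] = t;
        sort(a, lo, i - 1, cutoff);
        sort(a, i + 1, hi, cutoff);
    }

    public static void main(String[] args) {
        Random rnd = new Random(42);
        int n = 1_000_000;
        int[] original = new int[n];
        for (int i = 0; i < n; i++) original[i] = rnd.nextInt();

        // Time the same random input with several cutoffs. Which one wins
        // depends on your CPU, cache, and JVM; that is the whole point.
        for (int cutoff : new int[]{1, 5, 10, 15, 25, 50}) {
            int[] a = original.clone();
            long start = System.nanoTime();
            sort(a, 0, a.length - 1, cutoff);
            long ms = (System.nanoTime() - start) / 1_000_000;
            System.out.println("cutoff " + cutoff + ": " + ms + " ms");
        }
    }
}

On one machine the sweet spot might be near 10, on another closer to 30 or more; either way the asymptotic O(n log n) analysis is unchanged, which is exactly the distinction between theoretical and actual performance.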
