I am implementing a script which compares bubble sort, insertion sort and selection sort.
Although all these have a time complexity of the order of N^2, selection sort is supposed to be twice as fast as bubble sort, but I am getting different results.
Here is my code and set up
#!/usr/bin/env python3
import random
import time
import copy
# timer function for sorting algorithms
def timeit(f):
def wrapper(*args, **kwargs):
st = time.time()
y =f( *args,**kwargs)
ft = time.time()
print(f.__name__,'\n', ft-st, 'seconds')
if y:
print("verifying...")
assert sorted(args[0]) == y, f'{f.__name__} did not sort the array correctly'
print(f'{f.__name__} sorted correctly\n\n')
return y
return wrapper
# function to sample unsorted array
def sample_array(n=10000, type=['unordered','ordered'][0]):
if type == 'unordered':
return list(random.choices( list(range(1,101)),k=n))
if type == 'ordered':
return sorted(list(random.choices( list(range(1,101)),k=n)))
@timeit
def bubble_sort(arr, ascending=True):
while True:
swapped = False
for idx in range(len(arr)):
try:
if arr[idx] > arr[idx+1]:
arr[idx+1] = arr[idx]
swapped = True
elif arr[idx] <= arr[idx+1]:
pass
except IndexError:
pass
if not swapped:
return arr
@timeit
def selection_sort(arr):
n = len(arr)
for i in range(n-1):
min_idx = i
for j in range(i+1, n):
if arr[min_idx] > arr[j]:
min_idx = j
if min_idx != i:
tmp = arr[i]
arr[i] = arr[min_idx]
arr[min_idx] = tmp
return arr
# comparing to python's sort algorithm
@timeit
def python_sort(arr):
return sorted(arr)
@timeit
def insertion_sort(arr):
n = len(arr)
for idx in range(1, n):
temp = arr[idx]
pos = idx - 1
while pos >= 0:
if arr[pos] > temp:
arr[pos+1] = arr[pos]
pos -= 1
else:
break
arr[pos+1] = temp
return arr
if __name__=="__main__":
x = sample_array()
for f in [bubble_sort,python_sort,selection_sort, insertion_sort]:
xc = copy.deepcopy(x)
f(xc)
Now I would expect the output to reflect the fact that although of the same order, quadratic in N, the number of elements to be sorted, the selection sort should be around twice as fast as bubble sort, since it takes half as many steps. But I get a different output, as follows
bubble_sort
0.0033044815063476562 seconds
verifying...
bubble_sort sorted correctly
python_sort
0.0008809566497802734 seconds
verifying...
python_sort sorted correctly
selection_sort
2.7145907878875732 seconds
verifying...
selection_sort sorted correctly
insertion_sort
3.0474853515625 seconds
verifying...
insertion_sort sorted correctly
where both selection_sort
and insertion_sort
take a lot longer than the bubble_sort
implementation and python's optimized sorting algorithm, when operating on the exact same unordered array.
What am I doing wrong?
I have made sure that each time an algorithm is called, it is called upon a deep copy of the original unordered array. But still, I am getting these unusual results. Can anyone see any obvious mistakes in the implementation?
Or are my result reflective of these algorithms actual performance?
Thank you for your help.
I am implementing a script which compares bubble sort, insertion sort and selection sort.
Although all these have a time complexity of the order of N^2, selection sort is supposed to be twice as fast as bubble sort, but I am getting different results.
Here is my code and set up
#!/usr/bin/env python3
import random
import time
import copy
# timer function for sorting algorithms
def timeit(f):
def wrapper(*args, **kwargs):
st = time.time()
y =f( *args,**kwargs)
ft = time.time()
print(f.__name__,'\n', ft-st, 'seconds')
if y:
print("verifying...")
assert sorted(args[0]) == y, f'{f.__name__} did not sort the array correctly'
print(f'{f.__name__} sorted correctly\n\n')
return y
return wrapper
# function to sample unsorted array
def sample_array(n=10000, type=['unordered','ordered'][0]):
if type == 'unordered':
return list(random.choices( list(range(1,101)),k=n))
if type == 'ordered':
return sorted(list(random.choices( list(range(1,101)),k=n)))
@timeit
def bubble_sort(arr, ascending=True):
while True:
swapped = False
for idx in range(len(arr)):
try:
if arr[idx] > arr[idx+1]:
arr[idx+1] = arr[idx]
swapped = True
elif arr[idx] <= arr[idx+1]:
pass
except IndexError:
pass
if not swapped:
return arr
@timeit
def selection_sort(arr):
n = len(arr)
for i in range(n-1):
min_idx = i
for j in range(i+1, n):
if arr[min_idx] > arr[j]:
min_idx = j
if min_idx != i:
tmp = arr[i]
arr[i] = arr[min_idx]
arr[min_idx] = tmp
return arr
# comparing to python's sort algorithm
@timeit
def python_sort(arr):
return sorted(arr)
@timeit
def insertion_sort(arr):
n = len(arr)
for idx in range(1, n):
temp = arr[idx]
pos = idx - 1
while pos >= 0:
if arr[pos] > temp:
arr[pos+1] = arr[pos]
pos -= 1
else:
break
arr[pos+1] = temp
return arr
if __name__=="__main__":
x = sample_array()
for f in [bubble_sort,python_sort,selection_sort, insertion_sort]:
xc = copy.deepcopy(x)
f(xc)
Now I would expect the output to reflect the fact that although of the same order, quadratic in N, the number of elements to be sorted, the selection sort should be around twice as fast as bubble sort, since it takes half as many steps. But I get a different output, as follows
bubble_sort
0.0033044815063476562 seconds
verifying...
bubble_sort sorted correctly
python_sort
0.0008809566497802734 seconds
verifying...
python_sort sorted correctly
selection_sort
2.7145907878875732 seconds
verifying...
selection_sort sorted correctly
insertion_sort
3.0474853515625 seconds
verifying...
insertion_sort sorted correctly
where both selection_sort
and insertion_sort
take a lot longer than the bubble_sort
implementation and python's optimized sorting algorithm, when operating on the exact same unordered array.
What am I doing wrong?
I have made sure that each time an algorithm is called, it is called upon a deep copy of the original unordered array. But still, I am getting these unusual results. Can anyone see any obvious mistakes in the implementation?
Or are my result reflective of these algorithms actual performance?
Thank you for your help.
Share Improve this question edited Feb 6 at 12:49 Dana Strong asked Feb 6 at 12:45 Dana StrongDana Strong 114 bronze badges New contributor Dana Strong is a new contributor to this site. Take care in asking for clarification, commenting, and answering. Check out our Code of Conduct.1 Answer
Reset to default 0The problem is that you have not implemented bubble sort correctly. It does produce a sorted list (and so it passes the test), but it doesn't (always) have all the values of the original list, since it overwrites values. So your test was not thorough enough.
The problem with the test
The test assumes that f
will not tamper with the values in the given list, but will only re-arrange them. If you really want to ensure that the same values appear in the sort-result, and not some other values, or values with different number of occurrences, then you should first determine the expected result before executing f
. So in wrapper
, insert this statement at the very top:
expected = sorted(args[0])
And compare with that in your assert
statement:
assert expected == y, f'{f.__name__} did not sort the array correctly'
With this modification, you'll get the following assertion error:
bubble_sort did not sort the array correctly
The problem with bubble_sort
This statement is wrong:
arr[idx+1] = arr[idx]
With this statement you duplicate the value at index idx
at the cost of the value at its right side.
It should have been a swap:
arr[idx], arr[idx+1] = arr[idx+1], arr[idx]
Now the timings will be more in line with expectations.
But there are several other remarks to be made:
Other Remarks
1. Bubble sort
The following statement does nothing useful:
elif arr[idx] <= arr[idx+1]:
pass
This is making a comparison, and whatever the result, nothing will be done. So just remove those two lines of code.
Instead of making an iteration with idx
equal to the last index in the list, and trapping the error that will occur with:
try:
...
except IndexError:
pass
...just make one iteration less (with range(len(arr)-1)
). Then there is no more error either, and you don't need that try...catch
of code anymore.
Your function declaration allows for an ascending
argument, but it is never used. So drop it, as none of the other sorting functions have that parameter.
As bubble sort grows a sorted partition at the right side of the list, you can stop the inner loop sooner the more iterations the outer loop makes, and so it will be more interesting to use a for
loop for the outer loop as well:
def bubble_sort(arr):
for end in range(len(arr) - 1, 0, -1):
swapped = False
for idx in range(end):
if arr[idx] > arr[idx+1]:
arr[idx], arr[idx+1] = arr[idx+1],arr[idx]
swapped = True
if not swapped:
return arr
2. Selection sort
Here you have performed the swap in a non-pythonic way. Just do it with tuple assignment:
arr[i], arr[min_idx] = arr[min_idx], arr[i]
3. Insertion sort
Instead of performing the shifts one by one, you could first identify the position where the next value must be inserted (without moving any values yet), and then perform the rotation with slicing:
def insertion_sort(arr):
for idx in range(1, len(arr)):
temp = arr[idx]
for pos in range(idx, 0, -1):
if arr[pos-1] <= temp:
break
else:
pos = 0
arr[pos+1: idx+1] = arr[pos:idx]
arr[pos] = temp
return arr