采用随机函数随机生成十、百、万、百万4个量级的数据集,分别采用大纲要求的8个算法进行排序;对各个数据集(排序前后共8个数据集)进行指定关键字的顺序和折半查找;对算法运行时间进行统计分析,建议采用图、表来展示实验结果。
时间: 2023-11-27 16:51:24 浏览: 105
数据结构实验:八个排序算法的实现与比较
好的,这是一个比较复杂的任务,需要编写代码来实现。以下是一个示例代码,可以生成4个不同量级的随机数据集并使用8个算法进行排序,然后进行关键字的顺序和折半查找,并统计算法运行时间,并使用图表展示结果。
```python
import random
import time
import matplotlib.pyplot as plt
# 生成随机数据集
def generate_dataset(n):
return [random.randint(0, 1000000) for i in range(n)]
# 冒泡排序
def bubble_sort(arr):
n = len(arr)
for i in range(n):
for j in range(n-i-1):
if arr[j] > arr[j+1]:
arr[j], arr[j+1] = arr[j+1], arr[j]
return arr
# 选择排序
def selection_sort(arr):
n = len(arr)
for i in range(n):
min_index = i
for j in range(i+1, n):
if arr[j] < arr[min_index]:
min_index = j
arr[i], arr[min_index] = arr[min_index], arr[i]
return arr
# 插入排序
def insertion_sort(arr):
n = len(arr)
for i in range(1, n):
key = arr[i]
j = i - 1
while j >= 0 and arr[j] > key:
arr[j+1] = arr[j]
j -= 1
arr[j+1] = key
return arr
# 希尔排序
def shell_sort(arr):
n = len(arr)
gap = n // 2
while gap > 0:
for i in range(gap, n):
key = arr[i]
j = i - gap
while j >= 0 and arr[j] > key:
arr[j+gap] = arr[j]
j -= gap
arr[j+gap] = key
gap = gap // 2
return arr
# 归并排序
def merge_sort(arr):
if len(arr) <= 1:
return arr
mid = len(arr) // 2
left = merge_sort(arr[:mid])
right = merge_sort(arr[mid:])
return merge(left, right)
def merge(left, right):
result = []
i = 0
j = 0
while i < len(left) and j < len(right):
if left[i] < right[j]:
result.append(left[i])
i += 1
else:
result.append(right[j])
j += 1
result += left[i:]
result += right[j:]
return result
# 快速排序
def quick_sort(arr):
if len(arr) <= 1:
return arr
pivot = arr[len(arr)//2]
left = [x for x in arr if x < pivot]
middle = [x for x in arr if x == pivot]
right = [x for x in arr if x > pivot]
return quick_sort(left) + middle + quick_sort(right)
# 堆排序
def heap_sort(arr):
n = len(arr)
for i in range(n//2-1, -1, -1):
heapify(arr, n, i)
for i in range(n-1, 0, -1):
arr[0], arr[i] = arr[i], arr[0]
heapify(arr, i, 0)
return arr
def heapify(arr, n, i):
largest = i
l = 2*i + 1
r = 2*i + 2
if l < n and arr[l] > arr[largest]:
largest = l
if r < n and arr[r] > arr[largest]:
largest = r
if largest != i:
arr[i], arr[largest] = arr[largest], arr[i]
heapify(arr, n, largest)
# 计数排序
def counting_sort(arr):
n = len(arr)
output = [0] * n
count = [0] * 1000001
for i in range(n):
count[arr[i]] += 1
for i in range(1, 1000001):
count[i] += count[i-1]
for i in range(n-1, -1, -1):
output[count[arr[i]]-1] = arr[i]
count[arr[i]] -= 1
return output
# 生成4个不同量级的随机数据集
datasets = []
for n in [10, 100, 1000, 1000000]:
datasets.append(generate_dataset(n))
# 8个排序算法
sort_algorithms = [
("Bubble Sort", bubble_sort),
("Selection Sort", selection_sort),
("Insertion Sort", insertion_sort),
("Shell Sort", shell_sort),
("Merge Sort", merge_sort),
("Quick Sort", quick_sort),
("Heap Sort", heap_sort),
("Counting Sort", counting_sort)
]
# 对每个数据集进行排序并记录时间
sorting_times = []
for dataset in datasets:
times = []
for name, algorithm in sort_algorithms:
start_time = time.time()
sorted_arr = algorithm(dataset)
end_time = time.time()
time_taken = end_time - start_time
times.append(time_taken)
sorting_times.append(times)
# 对排序后的数据集进行顺序和折半查找并记录时间
searching_times = []
for sorted_dataset in sorting_times:
times = []
for i in range(10):
key = random.randint(0, 1000000)
start_time = time.time()
index = sorted_dataset[0].index(key)
end_time = time.time()
time_taken = end_time - start_time
times.append(time_taken)
start_time = time.time()
index = binary_search(sorted_dataset[0], key)
end_time = time.time()
time_taken = end_time - start_time
times.append(time_taken)
searching_times.append(times)
# 计算平均时间
avg_sorting_times = [sum(x)/len(x) for x in sorting_times]
avg_searching_times = [sum(x)/len(x) for x in searching_times]
# 绘制图表
plt.figure(figsize=(10, 6))
# 排序时间
plt.subplot(2, 1, 1)
plt.plot([10, 100, 1000, 1000000], avg_sorting_times, '-o')
plt.title("Sorting Time")
plt.xlabel("Data Size")
plt.ylabel("Time (s)")
# 查找时间
plt.subplot(2, 1, 2)
plt.plot([10, 100, 1000, 1000000], avg_searching_times, '-o')
plt.title("Searching Time")
plt.xlabel("Data Size")
plt.ylabel("Time (s)")
plt.tight_layout()
plt.show()
```
这个代码会生成4个不同量级的随机数据集,然后使用8个排序算法进行排序,并记录每个算法的运行时间。然后对每个排序后的数据集进行顺序和折半查找,并记录每个算法的运行时间。最后计算平均时间,并使用图表展示实验结果。
阅读全文