CTR的堆

Posted on 2022-08-01 Edited on 2026-03-23

我们在学习选择排序的时候，每次选择最小的一个数，将其放到最前面，就可以在 $O(n^2)$ 内完成排序。

我们不满足于这样的复杂度，我们想要更快地找到最小的数。

找最小的数有很多方法，你可以用平衡树或者线段树做到 $\log n$ 维护并查找最小值。但是这些算法太高级了， CTR 先不学。

所以我们找到了二叉堆。

我们介绍小根堆（维护最小值）。

堆是一颗完全二叉树，每个节点都有一个权值和一个编号。

为了方便，我们记编号为 $i$ 的节点的权值为 $w[i]$ ，它的爸爸的编号是 $fa[i]$ ，它的左儿子的编号是 $ls[i]$ ，它的右儿子的编号叫 $rs[i]$ 。

堆规定：

对于所有不是根节点的节点，它的爸爸的权值比它的权值小。
没了。

那么我们知道了什么？根节点就是堆的权值最小的节点！

为了方便，我作以下规定，对于所有 $i$ :

$ls[i] = 2\cdot i,\ rs[i] = 2\cdot i + 1$

那么相应的，有：

$fa[i] = \lfloor\frac{i}{2}\rfloor$

想想为什么。

但是我们知道了规矩没有用，我们得想想怎么去维护它。

首先，想想我们要支持哪些操作。

没错，我们需要一个删除操作和一个插入操作。

插入

首先想想怎么插入。

我们可以先将新的数插入列表的末尾。

但是要是新的数比它的父亲小，那就需要调整树的形态。我们这样调整:

如果当前节点的权值比父节点的权值要小，那么交换当前节点和父节点；否则跳出循环。
到父节点的位置继续如上操作，直到当前位置为根节点。

想想为什么这样调整能够维护堆性质。

给你两张图理解一下。可以自己画：

有这么一个堆：

CTRSB

你把它往上交换：

CTRSZ

由于堆的高度是 $O(\log n)$ 的，所以插入操作是 $O(\log n)$ 的。

给你个代码：

def Push(k):
    global tot
    global heap
    heap.append(k)
    tot += 1
    i = tot
    while i > 1:
        if heap[i // 2] > heap[i]:
            heap[i // 2], heap[i] = heap[i], heap[i // 2]
            i //= 2
        else:
            break
    return

global 是用来引用全局变量的，不用管他。

删除

删除操作要稍微复杂一点。

分为三步：

将根节点和编号最大的节点交换位置。
从根节点开始，找到两个子节点的较小节点，如果该子节点比当前节点小，那么交换当前节点和该子节点的位置；否则结束循环。
如果交换了，那么到该子节点重复以上判断。

图就不放了，会了插入，删除其实也差不多。

时间复杂度同样是 $O(\log n)$ 的。

def Pop():
    global tot
    global heap
    heap[tot], heap[1] = heap[1], heap[tot]
    heap.pop()
    tot -= 1
    i = 1
    while i * 2 < tot:
        minn = i * 2
        if heap[minn] > heap[i * 2 + 1]:
            minn += 1
        if heap[minn] < heap[i]:
            heap[i], heap[minn] = heap[minn], heap[i]
            i = minn
        else:
            break
    if i * 2 == tot and heap[tot] < heap[i]:
        heap[tot], heap[i] = heap[i], heap[tot]
    return

那么如何查询呢？输出堆顶元素即可，时间复杂度 $O(1)$ 。

完整代码：


# heap_sort start-----------------------------------------------------------------------------------

heap = [0]
tot = 0

def Push(k):
    global tot
    global heap
    heap.append(k)
    tot += 1
    i = tot
    while i > 1:
        if heap[i // 2] > heap[i]:
            heap[i // 2], heap[i] = heap[i], heap[i // 2]
            i //= 2
        else:
            break
    return

def Top():
    global heap
    return heap[1]

def Pop():
    global tot
    global heap
    heap[tot], heap[1] = heap[1], heap[tot]
    heap.pop()
    tot -= 1
    i = 1
    while i * 2 < tot:
        minn = i * 2
        if heap[minn] > heap[i * 2 + 1]:
            minn += 1
        if heap[minn] < heap[i]:
            heap[i], heap[minn] = heap[minn], heap[i]
            i = minn
        else:
            break
    if i * 2 == tot and heap[tot] < heap[i]:
        heap[tot], heap[i] = heap[i], heap[tot]
    return

def heap_sort(lis):
    for i in lis:
        Push(i)
    noww = len(lis)
    for i in range(noww):
        lis[i] = Top()
        Pop()
    return

# heap_sort end-------------------------------------------------------------------------

if __name__ == '__main__':
    lis = [100000 - i for i in range(100000)]
    print("The original array is:")
    print(lis)
    heap_sort(lis)
    print("The heap_sorted array is:")
    print(lis)

结语

二叉堆是一个很优秀的数据结构，空间复杂度 $O(n)$ ，单次操作时间复杂度 $O(\log n)$ ，还具有相当优秀的常数。

堆排序并不是堆的所有用途，学会了堆，你能做的远远不止排序。