10.2 前缀树应用

Kesa...大约 10 分钟

前缀树主要用于字符串查找, 时间复杂度O(k), k为字符串长度.

和哈希表不同, 哈希表需要完整的字符串才能查找, 而前缀树只需要知道前缀就可进行查找.

10.2.1 问题63: 替换单词

在英语中，有一个叫做 词根(root) 的概念，它可以跟着其他一些词组成另一个较长的单词——我们称这个词为 继承词(successor)。例如，词根an，跟随着单词 other(其他)，可以形成新的单词 another(另一个)。
现在，给定一个由许多词根组成的词典和一个句子，需要将句子中的所有继承词用词根替换掉。如果继承词有许多可以形成它的词根，则用最短的词根替换它。
需要输出替换之后的句子。
示例 1：
输入：dictionary = ["cat","bat","rat"], sentence = "the cattle was rattled by the battery"
输出："the cat was rat by the bat"
示例 2：
输入：dictionary = ["a","b","c"], sentence = "aadsfasf absbs bbab cadsfafs"
输出："a a b c"
示例 3：
输入：dictionary = ["a", "aa", "aaa", "aaaa"], sentence = "a aa a aaaa aaa aaa aaa aaaaaa bbb baba ababa"
输出："a a a a a a a a bbb baba a"
示例 4：
输入：dictionary = ["catt","cat","bat","rat"], sentence = "the cattle was rattled by the battery"
输出："the cat was rat by the bat"
示例 5：
输入：dictionary = ["ac","ab"], sentence = "it is abnormal that this solution is accepted"
输出："it is ab that this solution is ac"
提示：
1 <= dictionary.length <= 1000
1 <= dictionary[i].length <= 100
dictionary[i] 仅由小写字母组成。
1 <= sentence.length <= 10^6
sentence 仅由小写字母和空格组成。
sentence 中单词的总量在范围 [1, 1000] 内。
sentence 中每个单词的长度在范围 [1, 1000] 内。
sentence 中单词之间由一个空格隔开。
sentence 没有前导或尾随空格。

10.2.1.1 分析&题解

使用输入的字典构造前缀树
遍历字符串的单词, 判断单词前缀是否在前缀树中, 若是, 则用前缀替换
返回新的字符串

func replaceWords(dictionary []string, sentence string) string {
	// 构建前缀树
	trie := buildTrie(dictionary)

	// 替换单词
	var sb strings.Builder
	sb.Grow(len(sentence))
	words := strings.Split(sentence, " ")
	for i, w := range words {
		// 寻找前缀
		prefix := findPrefix(trie, w)
		if prefix != "" {
			w = prefix
		}

		sb.WriteString(w)
		if i != len(words)-1 {
			sb.WriteByte(' ')
		}
	}

	return sb.String()
}

func findPrefix(root *Trie, w string) string {
	prefix := ""

	var sb strings.Builder
	sb.Grow(len(w))
	cur := root
	for _, c := range w {
		ch := c
		c -= 'a'
		// 已找到或者不存在
		if cur.isWord || cur.children[c] == nil {
			break
		}
		sb.WriteRune(ch)

		cur = cur.children[c]
	}

	// 已找到
	if cur.isWord {
		prefix = sb.String()
	}

	return prefix
}

func buildTrie(dic []string) *Trie {
	root := &Trie{}
	for _, w := range dic {
		root.Insert(w)
	}

	return root
}

// Trie
type Trie struct {
	children [26]*Trie
	isWord   bool
}

func (t *Trie) Insert(w string) {
	cur := t
	for _, c := range w {
		c -= 'a'
		if cur.children[c] == nil {
			cur.children[c] = &Trie{}
		}

		cur = cur.children[c]
	}

	cur.isWord = true
}

10.2.2 问题64: 神奇字典

LCR 064. 实现一个魔法字典open in new window

设计一个使用单词列表进行初始化的数据结构，单词列表中的单词 互不相同 。如果给出一个单词，请判定能否只将这个单词中一个字母换成另一个字母，使得所形成的新单词存在于已构建的神奇字典中。
实现 MagicDictionary 类：
MagicDictionary() 初始化对象
void buildDict(String[] dictionary) 使用字符串数组 dictionary 设定该数据结构，dictionary 中的字符串互不相同
bool search(String searchWord) 给定一个字符串 searchWord ，判定能否只将字符串中一个字母换成另一个字母，使得所形成的新字符串能够与字典中的任一字符串匹配。如果可以，返回 true ；否则，返回 false 。
示例：
输入
inputs = ["MagicDictionary", "buildDict", "search", "search", "search", "search"]
inputs = [[], [["hello", "leetcode"]], ["hello"], ["hhllo"], ["hell"], ["leetcoded"]]
输出
[null, null, false, true, false, false]

解释
MagicDictionary magicDictionary = new MagicDictionary();
magicDictionary.buildDict(["hello", "leetcode"]);
magicDictionary.search("hello"); // 返回 False
magicDictionary.search("hhllo"); // 将第二个 'h' 替换为 'e' 可以匹配 "hello" ，所以返回 True
magicDictionary.search("hell"); // 返回 False
magicDictionary.search("leetcoded"); // 返回 False
提示：
1 <= dictionary.length <= 100
1 <= dictionary[i].length <= 100
dictionary[i] 仅由小写英文字母组成
dictionary 中的所有字符串 互不相同
1 <= searchWord.length <= 100
searchWord 仅由小写英文字母组成
buildDict 仅在 search 之前调用一次
最多调用 100 次 search

10.2.2.1 分析&题解

构建字典树
深度优先搜索:
1. 递归函数 dfs(node, pos, modified) boolnode: 当前节点 pos: 带查询的第pos个字符 modified: 是否修改过
2. 若node 的子节点和pos的字符相同则继续递归
3. 2)不满足的话, 判断是否修改过, 没有则将任意子节点替换后, 以modified为true继续递归
4. 已结束查找, 已修改过一次字符后和字典中单词相同(modified == true && isWord == true)
5. 上述步骤均不满足, 返回false

type MagicDictionary struct {
	trie *Trie
}

func Constructor() MagicDictionary {
	return MagicDictionary{trie: &Trie{}}
}

func (md *MagicDictionary) BuildDict(dictionary []string) {
	for _, w := range dictionary {
		md.trie.Insert(w)
	}
}

func (md *MagicDictionary) Search(searchWord string) bool {
	return dfs(md.trie, searchWord, false)
}

func dfs(node *Trie, s string, modified bool) bool {
	// 查找结束
	if s == "" {
		return node.isWord && modified
	}

	// 深度优先
	// 当前节点相同则查找子节点
	c := s[0] - 'a'
	if node.children[c] != nil && dfs(node.children[c], s[1:], modified) {
		return true
	}

	// 未修改则用任意子节点替换, 继续查找
	if !modified {
		for i, child := range node.children {
			if i != int(c) && child != nil && dfs(child, s[1:], true) {
				return true
			}
		}
	}

	return false
}

type Trie struct {
	children [26]*Trie
	isWord   bool
}

func (t *Trie) Insert(w string) {
	cur := t
	for _, c := range w {
		c -= 'a'
		if cur.children[c] == nil {
			cur.children[c] = &Trie{}
		}

		cur = cur.children[c]
	}

	cur.isWord = true
}

10.2.3 问题65: 最短的单词编码

LCR 065. 单词的压缩编码open in new window

单词数组 words 的 有效编码 由任意助记字符串 s 和下标数组 indices 组成，且满足：
words.length == indices.length
助记字符串 s 以 '#' 字符结尾
对于每个下标 indices[i] ，s 的一个从 indices[i] 开始、到下一个 '#' 字符结束（但不包括 '#'）的 子字符串 恰好与 words[i] 相等
给定一个单词数组 words ，返回成功对 words 进行编码的最小助记字符串 s 的长度。
示例 1：
输入：words = ["time", "me", "bell"]
输出：10
解释：一组有效编码为 s = "time#bell#" 和 indices = [0, 2, 5] 。
words[0] = "time" ，s 开始于 indices[0] = 0 到下一个 '#' 结束的子字符串，如加粗部分所示 "time#bell#"
words[1] = "me" ，s 开始于 indices[1] = 2 到下一个 '#' 结束的子字符串，如加粗部分所示 "time#bell#"
words[2] = "bell" ，s 开始于 indices[2] = 5 到下一个 '#' 结束的子字符串，如加粗部分所示 "time#bell#"
示例 2：
输入：words = ["t"]
输出：2
解释：一组有效编码为 s = "t#" 和 indices = [0] 。
提示：
1 <= words.length <= 2000
1 <= words[i].length <= 7
words[i] 仅由小写字母组成

10.2.3.1 分析&题解

如果一个单词是另一个单词的后缀(me, time), 那么在助记字符串中只需出现一个(time).

若将两个字符串反转, 那么反转后的单词一定是另一个的前缀(em, emit). 此时可以使用前缀树存储单词.

由于助记词中单词以#结尾, 那么助记词的长度为前缀树根节点到叶节点路径长度加一.

可以使用深度优先遍历, 计算路径长度.

构建前缀树, 将单词反向存入前缀树
深度优先遍历前缀树, 计算根节点到叶节点的路径长度和.

func minimumLengthEncoding(words []string) int {
    // 构建前缀树
    reTrie := buildReTrie(words)

    // 深度优先遍历
    // 路径长度和
    sum := 0
    var dfs func(*ReTrie, int)
    dfs = func(node *ReTrie, length int) {
        isLeaf := true
        for _, child := range node.children {
            if child != nil {
                isLeaf = false
                // 遍历子节点
                dfs(child, length+1)
            }
        }

        // 到达叶节点, 统计路径长度
        if isLeaf {
            sum += length
        }
    }
    // 单词后多一个'#'
    dfs(reTrie, 1)

    return sum
}

func buildReTrie(words []string) *ReTrie {
    root := &ReTrie{}
    for _, w := range words {
        root.Insert(w)
    }
    return root
}

type ReTrie struct {
    children [26]*ReTrie
}

func (rt *ReTrie) Insert(w string) {
    cur := rt
    for i := len(w)-1; i >= 0; i-- {
        c := w[i] - 'a'
        if cur.children[c] == nil {
            cur.children[c] = &ReTrie{}
        }

        cur = cur.children[c]
    }
}

10.2.4 问题66: 单词之和

LCR 066. 键值映射open in new window

实现一个 MapSum 类，支持两个方法，insert 和 sum：
MapSum() 初始化 MapSum 对象
void insert(String key, int val) 插入 key-val 键值对，字符串表示键 key ，整数表示值 val 。如果键 key 已经存在，那么原来的键值对将被替代成新的键值对。
int sum(string prefix) 返回所有以该前缀 prefix 开头的键 key 的值的总和。
示例：
输入：
inputs = ["MapSum", "insert", "sum", "insert", "sum"]
inputs = [[], ["apple", 3], ["ap"], ["app", 2], ["ap"]]
输出：
[null, null, 3, null, 5]

解释：
MapSum mapSum = new MapSum();
mapSum.insert("apple", 3);  
mapSum.sum("ap");           // return 3 (apple = 3)
mapSum.insert("app", 2);    
mapSum.sum("ap");           // return 5 (apple + app = 3 + 2 = 5)
提示：
1 <= key.length, prefix.length <= 50
key 和 prefix 仅由小写英文字母组成
1 <= val <= 1000
最多调用 50 次 insert 和 sum

10.2.4.1 分析&题解

构建前缀树, 若某个节点代表单词key的结尾那么可以将此节点的值设置为value.

对于前缀prefix, 查找到prefix在前缀树中的最后一个节点, 那么此节点的任意子树都是符合条件的, 将所有的节点值累加即可.

构建前缀树, 记录key, value
深度优先遍历前缀树:
1. 找到前缀prefix所在的最后一个节点, 不在直接返回0.
2. 递归统计节点的所有子树的节点值之和

type MapSum struct {
    trie *MapTrie
}

func Constructor() MapSum {
    return MapSum{trie: &MapTrie{}}
}


func (ms *MapSum) Insert(key string, val int)  {
    ms.trie.Insert(key, val)
}


func (ms *MapSum) Sum(prefix string) int {
    cur := ms.trie
    // find last node in trie of prefix 
    for _, ch := range prefix {
        c := ch - 'a'
        // not found
        if cur.children[c] == nil {
            return 0
        }

        cur = cur.children[c]
    }

    // calculate val in all subtree
    var dfsSum func(*MapTrie) int 
    dfsSum = func(node *MapTrie) int {
        sum := 0
        if node == nil {
            return sum
        }

        // current node 
        sum += node.val
        // subtree
        for _, child := range node.children {
            if child != nil {
                sum += dfsSum(child)
            }
        }

        return sum
    }

    return dfsSum(cur)
}

type MapTrie struct {
    children [26]*MapTrie
    val int
}

func (mt *MapTrie) Insert(w string, val int) {
    cur := mt
    for _, ch := range w {
        c := ch - 'a'
        if cur.children[c] == nil {
            cur.children[c] = &MapTrie{}
        }
        
        cur = cur.children[c]
    }

    cur.val = val
}

10.2.5 问题67: 最大异或

LCR 067. 数组中两个数的最大异或值open in new window

给定一个整数数组 nums ，返回 nums[i] XOR nums[j] 的最大运算结果，其中 0 ≤ i ≤ j < n 。
示例 1：
输入：nums = [3,10,5,25,2,8]
输出：28
解释：最大运算结果是 5 XOR 25 = 28.
示例 2：
输入：nums = [0]
输出：0
示例 3：
输入：nums = [2,4]
输出：6
示例 4：
输入：nums = [8,10,2]
输出：10
示例 5：
输入：nums = [14,70,53,83,49,91,36,80,92,51,66,70]
输出：127
提示：
1 <= nums.length <= 2 * 10^5
0 <= nums[i] <= 2^31 - 1
**进阶：**你可以在 O(n) 的时间解决这个问题吗？

10.2.5.1 分析&题解

若使用暴力解法, 时间复杂度为 $O(n^2)$ .

异或运算, 二进制位不同则为1. 那么对于一个数字, 要想其异或的结果尽量大, 应该选择二进制位相反的数来计算.

构建前缀树, 将整数二进制位存入树中
遍历数组, 对于数字:
1. 寻找在前缀树中二进制位尽量相反的节点
2. 进行异或运算, 更新最大值

时间复杂度: O(n), 异或运算中的双重循环, 二进制位数是常量视作O(1)

func findMaximumXOR(nums []int) int {
    // 构建前缀树
    trie := buildBitTrie(nums)

    // 计算最大异或值
    maxXOR := 0
    for _, n := range nums {
        cur := trie
        xor := 0
        for i := 31; i >=0; i-- {
            bit := (n >> i) & 1
            if cur.children[1-bit] != nil { // 相反bit
                xor = (xor << 1) + 1 // 异或结果 1
                cur = cur.children[1-bit]
            } else { // 相同 bit
                xor <<= 1
                cur = cur.children[bit]
            }
        }
        
        maxXOR = max(maxXOR, xor)
    }

    return maxXOR
}

func buildBitTrie(nums []int) *BitTrie {
    root := &BitTrie{}
    for _, n := range nums {
        root.Insert(int32(n))
    }
    return root
}

func max(a, b int) int {
    if a > b {
        return a
    }
    return b
}

type BitTrie struct {
    children [2]*BitTrie
}

func (bt *BitTrie) Insert(n int32) {
    cur := bt
    for i := 31; i >= 0; i-- {
        bit := (n >> i) & 1
        if cur.children[bit] == nil {
            cur.children[bit] = &BitTrie{}
        }
        
        cur = cur.children[bit]
    }
}

Reference

剑指Offer（专项突破版）open in new window

昵称

邮箱

网址

按正序
按倒序
按热度

10.2 前缀树应用

# 10.2.1 问题63: 替换单词

# 10.2.1.1 分析&题解

# 10.2.2 问题64: 神奇字典

# 10.2.2.1 分析&题解

# 10.2.3 问题65: 最短的单词编码

# 10.2.3.1 分析&题解

# 10.2.4 问题66: 单词之和

# 10.2.4.1 分析&题解

# 10.2.5 问题67: 最大异或

# 10.2.5.1 分析&题解

# Reference

预览: