This article explains the solution to LeetCode's hard-rated Word Ladder II problem. The code is written in Python and the algorithm used is Breadth First Search (BFS).

Introduction
Explanation
- Solution Steps
  - Step 1
  - Step 2
  - Final Code
Complexity explanation
Conclusion

Introduction

Several months ago, we came across the first part of this question, LC 127: Word Ladder I on LeetCode and it was an interesting one to tackle. We didn’t know there was a second part until fairly recently, so we were especially eager to take a shot at it. Our shot was successful (thankfully 😄) and I’ll walk you through our solution. I’ve included a brief description of the problem, but if like me, you enjoy reading problem statements and trying to understand them, do check it out at LC 126: Word Ladder II. Oh and if you’re wondering why Part II (LC 126) of the problem is indexed before Part 1 (LC 127), well I’m also wondering the same.

Key takeaways

Problem Summary

Given a beginning word, an ending word, and a word list, return all the shortest transformation sequences that can transform the beginning word into the ending word using only words in the word list. A transformation sequence is one such that adjacent words in it differ only by a single letter. E.g. Tap -> Top -> Pop is a valid transformation sequence because successive words differ by a single letter.

Say N is the number of words in the word list and M is the average length of each word:

Time → O(N^2* M)
Space → O(N * M)
Languages → [Python]
Approach → Breadth-First Search

Explanation

As we mentioned in the introduction, it might serve you best to read through the problem statement which you’d find here; [LC 126: Word Ladder II] but below is an overview of the problem.

We are given a beginWord, an endWord and a wordList as inputs. The first two are strings, and the last is a list of strings. We are asked to find the all the shortest possible transformation sequences that will transform the beginWord into the endWord using only words in the wordList (I really feel this should’ve been called wordBank, but…). If there is no such transformation sequence, then we are to return an empty list.

Below is a definition of some terms so that their purposes are clear in the context of this question;

transformation sequence: A sequence of words from from beginWord to endWord such that all the words in the sequence are present in the wordList and successive words differ only by a single letter.
beginWord: the word that begins our transformation sequence.
endWord: the word that should end our transformation sequence.
wordList: an array of words we are allowed to use to construct the transformation sequences
step: the unit of transformation from one word to another. It is always in the context of the wordList and it’s the number of words we need to go through to reach the end of a particular sequence.

For example, say

beginWord = hit
endWord = pot
wordList = [hot, pot]

Then hit -> hot -> pot is a valid transformation sequence as we only need to change one letter to go from one word to the next. It takes one step to go from hit to hot and two steps to go from hit to pot. Since there is no other way to go from hit to pot, then the aforementioned sequence is the answer. The shortest transformation sequence, therefore, requires two steps. And the answer to this problem would be returned as a list of the words in the format, [["hit", "hot", "pot"]].

Note that we could also have multiple valid transformation sequence that require the shortest number of steps. If in the example above, the word pit was added to the wordList, then another valid transformation sequence requiring only two steps would be hit -> pit -> pot. The answer would therefore be [["hit", "hot", "pot"], ["hit", "pit", "pot"]].

Now that we understand what the question asks of us, let’s get it solved. Often times, when a problem requires you to find the quickest way(s) to move from one state to another, it helps to construe it as a shortest distance problem. One of the algorithms that excels at finding shortest distances is the Breadth First Search, aka BFS and that’s what I employed in slaying LeetCode’s Word Search problems. Although this solution is specifically for Word Ladder II, it is very easily adaptable for use in solving Word Ladder I and I will leave a note somewhere in this post detailing what needs to be changed to that effect. While Word Ladder II asks for a list of all the shortest transformation sequences, Word Ladder I merely asks for the length of the shortest transformation sequence.

To model the problem such that it’s solvable with BFS, we must find a way to easily know and retrieve words that are adjacent each other i.e. they have an edge between them. A word, A is adjacent to another word, B if A can be transformed to B by changing only one letter. For example, the word hit and hot are adjacent to each other and have an edge between them because they differ by only one letter. This knowledge---what word is adjacent to another---will help us navigating the shortest path from the beginWord to the endWord.

To achieve this, we create an adjacency list of sorts in a pretty cool way. We group words together in a list if they share a pattern. A pattern is simply a possible point of difference a word could have with an adjacent word. The definition is probably not helpful enough and I feel an example would be more useful. So here goes;

the word, hit has three patterns, which are *it, h*t, and hi*, where * represents a possible point of difference that hit could have with an adjacent word
the third pattern, hi* , for example, suggests that hit would be adjacent to words that begin with hi and end with any other letter. Such word could be hip, him, his, hid, and so forth. These listed words are, therefore, one step away from hit
similarly, the words hot, hat, hut are one step away from hit because they conform to the h*t pattern. The same applies to pit, fit, sit for conforming to the *it pattern.
you can rightly tell that the number of patterns a word conforms to is equivalent to the length of the word as it is representative of the number of points of differences possessed by the word.

And so our pseudo-adjacency list is born---we create a dictionary whose keys are the patterns, and the values are a list of words in the wordList that conform to that pattern.

Next, we must begin our search for the Holy Grail---I mean for the shortest transformation sequences, and here is where we actually make use of BFS. Because it’s a sweeping algorithm, i.e. it combs through a graph layer-by-layer, or in our case, step-by-step, once we encounter the first shortest transformation sequence, we can be assured that the other sequences, if they exist, would immediately follow in that same step. And we can return our answer once we’re done with that step.

We make use of a queue in the BFS and we append new (unvisited) words to it. However, because the entire transformation sequence is to be returned as part of the answer, we need to track the word transformations that have previously occurred. Therefore, not only are appending new words to the queue, we are also including the word transformations that have brought us to the new word. We can achieve this in two ways:

store the index of the words in the wordList that have been involved in the transformation from the beginWord to the endWord and use this index to reconstruct the sequence for the final answer
store the words in the wordList that have been involved in the transformation from the beginWord to the endWord, so no reconstruction is necessary as you already have the words you need.

In our solution, we chose the first option (storing the index) but the second option should be just as effective.

One more important thing to note---you can be rest assured that if you encounter a new word for the first time while traversing a particular level (step) of the BFS, then you would not need to consider that word in a later step because you’ve previously found a shorter route to get to that word. However, if you encounter a word multiple times in the same step, each encounter would need to be considered because you’ve essentially found multiple routes to arrive at that word in the same number of steps. To take care of these considerations, we make use of two sets which you would see later in the code. One is a global visited set that contains words that have been seen throughout the entire BFS. The other is a local visiting set that contains words that have been seen only during the traversal of a particular step in the BFS. At the end of each step, the elements of the visiting set are added to the visited set and the visiting set is emptied at the beginning of the traversal of the next step.

Now let’s see some code, and other extra details!

Solution Steps

Step 1 → Build a dictionary of patterns and the list of words they conform to, and map the words to their corresponding index
Step 2 → Begin the BFS

Step 1

Here we build a dictionary where the keys are the patterns and the values are words that conform to that key. As mentioned earlier, note that a word of length, m has m patterns and will be found in the list of each of those patterns.

1def findLadders(self, beginWord: str, endWord: str, wordList: list[str]) -> list[list[str]]:
2		
3        if endWord not in wordList:    
4            return []
5            
6        word_to_index = collections.defaultdict(int)
7        pattern_bank = collections.defaultdict(list)
8        curr_index = 0
9        for word in wordList:
10            for i in range(len(word)):
11                pattern = word[0:i] + "*" + word[i+1:]
12                pattern_bank[pattern].append(word)
13            word_to_index[word] = curr_index
14            curr_index += 1
15           
16        def construct_sequence(index_list):
17            sequence = [beginWord]
18            for i in index_list:
19                sequence.append(wordList[i])
20            return sequence

In the code above, first we account for cases where the endWord is not in the wordList. In such cases, we are assured that there is no transformation that could take us to the endWord so we can simply return an empty list.

Next, we create word_to_index to store our word-to-index mappings, and pattern_bank for the patterns and their associated word(s). We populate both dictionaries by iterating through the wordList array.

Lastly, we define a function, construct_sequence that constructs a transformation sequence given the indices of the words involved in the sequence.

Step 2

Here, we initialise the BFS and begin the hunt for the shortest transformation sequences. We add the beginWord to the visited set and it will continue to store words that we’ve visited. We also initialise the queue with two things,

the beginWord;
an empty array

We’d always append a pair of elements to the queue in that manner while traversing our search space—a new word to visit, and an array containing the indices of the words that have so far been involved in creating the sequence. We initialise step to track the length of the transformation sequence, i.e. the level the BFS is at. And we initialise an empty result list, which would hold the final answer.

Note that the step variable is not needed to solve this problem, but it’s useful for LC 127: Word Ladder I as that is what you are required to return.

1def findLadders(self, beginWord: str, endWord: str, wordList: list[str]) -> list[list[str]]:
2
3#...continued from step 1 above 
4
5		visited = set(beginWord)
6        queue = collections.deque([(beginWord, [])])
7        step = 0
8        result = []
9        # print(pattern_bank)
10        # return []
11
12        while queue:
13            visiting = set()
14            for i in range(len(queue)):
15                word, index_list = queue.popleft()
16
17                if word == endWord:
18                    result.append(construct_sequence(index_list))
19                    continue
20
21                for i in range(len(word)):
22                    pattern = word[0:i] + "*" + word[i+1:]
23                    for next_word in pattern_bank[pattern]:
24                        if next_word != word and next_word not in visited:
25                            index_list.append(word_to_index[next_word])
26                            queue.append((next_word, index_list[:]))
27                            index_list.pop()
28                            visiting.add(next_word)
29                            
30            if len(result) > 0:
31                return result
32            visited = visited.union(visiting)
33
34            step += 1
35        return []

We have a visiting set which is used to store the words we’ve only just encountered at a particular level. It is important that we do this, as opposed to just adding each newly encountered word to the visited set because it is possible we take the same number of steps to arrive at a particular word through different routes. So for example, we can go from hit -> hot -> pot as well as from hit -> pit -> pot. We need to capture both instances because we arrived at the destination pot through different routes but in the same number of steps. Both are valid sequences, and must be included in the solution. Adding pot to the visited set the first time we visit it will mean that it won’t get added the second time we are sure to encounter it, and that’s not desired. Using a visiting set therefore allows us to keep track of words we’ve seen at a particular level, and we only add them to the visited set once we have exhausted all the words at that level.

When we pop a word, index_list pair from the left of the queue, we check if the word is the endWord. If it is, then that current step/level marks the shortest transformation sequence required. We construct the sequence of words that led up to that word, and append it to result. Then we move the iteration forward to check if there are other sequences that culminate in the endWord. If there are any, we repeat the process of reconstructing the sequence and appending it to the result. Also, each time we exhaust the traversal of a step/level, we check if result is empty. If it is empty it means that we’ve not had any luck encountering the endWord and we continue exploring the items in the queue, if they exist. If result isn’t empty, then it means we’ve stumbled on the endWord in the exploration of the last level, and are certain that all the sequences that lead up to the endWord at that level have been added to result. The value of steps in that last level represents the length of the shortest transformation sequence we need, so we can safely return result as the answer. If we empty the queue without ever hitting the return statement, then it means that there is no transformation sequence from the beginWord to the endWord and we can return an empty list.

Within the BFS itself, if the word we pop from the queue doesn’t equal the endWord, we try to add the neighbours of that word to the queue. We do this by

retrieving all the patterns that the word conforms to
for each pattern, we go through the list of words associated with it and add new words to the queue if they haven’t been added to the visited before.
before appending a new word to the queue, we get it’s index in the wordList, and append that index to the index_list. We append the new word, and the index_list to the queue, as earlier discussed. Note that we add a copy of this index_list and not the index_list itself because Python lists are mutable.
before moving to the next iteration, we pop out the newly added index from the index_list because we don’t want it to interfere with the next neighbours of the word we’re visiting.

And that’s it! Below is the final code;

Final Code

1class Solution:
2    def findLadders(self, beginWord: str, endWord: str, wordList: list[str]) -> list[list[str]]:
3
4            if endWord not in wordList:    
5                return []
6
7            word_to_index = collections.defaultdict(int)
8            pattern_bank = collections.defaultdict(list)
9            curr_index = 0
10            for word in wordList:
11                for i in range(len(word)):
12                    pattern = word[0:i] + "*" + word[i+1:]
13                    pattern_bank[pattern].append(word)
14                word_to_index[word] = curr_index
15                curr_index += 1
16
17            def construct_sequence(index_list):
18                sequence = [beginWord]
19                for i in index_list:
20                    sequence.append(wordList[i])
21                return sequence
22
23            visited = set(beginWord)
24            queue = collections.deque([(beginWord, [])])
25            step = 0
26            result = []
27            # print(pattern_bank)
28            # return []
29
30            while queue:
31                visiting = set()
32                for i in range(len(queue)):
33                    word, index_list = queue.popleft()
34
35                    if word == endWord:
36                        result.append(construct_sequence(index_list))
37                        continue
38
39                    for i in range(len(word)):
40                        pattern = word[0:i] + "*" + word[i+1:]
41                        for next_word in pattern_bank[pattern]:
42                            if next_word != word and next_word not in visited:
43                                index_list.append(word_to_index[next_word])
44                                queue.append((next_word, index_list[:]))
45                                index_list.pop()
46                                visiting.add(next_word)
47
48                if len(result) > 0:
49                    return result
50                visited = visited.union(visiting)
51
52                step += 1
53            return []

Complexity explanation

For this problem, the complexity is a bit tricky, but we’d try to propose a reasonable upper bound for it. In the explanation N and M represent the length of the wordList and the average length of each word respectively.

Pattern Bank and Word Sequence Each word has a total of M patterns, so we would perform N x M operations while constructing the pattern_bank. This also accomodates the creation of the word_to_index dictionary, as it is done in that same pass, and is an O(N) operation. Time complexity here is, therefore, O(N*M).

With regards to space, the size of the pattern_bank is the predominant term, and the number of keys in that dictionary will not exceed N x M. Space complexity here is, therefore, O(N*M).

Breadth First Search For an undirected graph with N nodes, the number of visits we could possibly make is capped at N * N. In our algorithm, we’d potentially visit a word multiple times because there could be many routes to that word, but it won’t be N * N times. Nonetheless, we could cap it at that. Furthermore, in the course of each visit we have to check through the patterns associated with the word (M patterns), and if we encounter a new word, we make a copy of its index_list Time complexity here is, therefore, O(N^2*M).

With regards to space, the size of the queue will never exceed N x M. Space complexity here is, therefore, O(N*M).

Overall Complexity

Time complexity → O(N^2* M)
Space complexity → O(N*M)

Conclusion

There you have it; the solution to LC 126: Word Ladder II . It was an interesting problem we enjoyed solving, and we hope we’ve at least done enough to put you on the path of solving not only this problem, but problems like this. Tell us what you think, and if you have any ideas on how to optimise the solution, we’d be glad to hear it!

LeetCode 126: Word Ladder II Solution

Table of Contents