CodeSignal Solves It: pressingButtons

pressingButtons technical interview question solution

Time to call 1-800-CODEFIGHTS (that’s 1-800-2633344487)! Given a number, your job is to find all the possible strings that the number could represent on a telephone’s number pad. This week’s Interview Practice Task of the Week was pressingButtons. This programming problem has been asked in technical interviews at Google, Amazon, Uber, and Facebook. In other words, this is a classic question!

In classic CodeSignal Solves It fashion, I’m going to tell you to go solve the problem yourself first. After all, you have to actually practice solving interview questions in order to become a better coder! Once you’ve solved it, head back here and I’ll talk you through two variations on a solution.

pressingButtons Facebook interview question solution

Done? Okay, let’s get dialed in!

The technical interview question

We are given a picture of a standard number pad to help us:

Amazon web developer interview solution

Given a number as a string, for example “42”, we are supposed to write a function pressingButtons that returns a (sorted) list of all possible strings associated with that number. In the case of “42”, the “4” maps to a “g”, “h”, or “i”, and the “2” maps to an “a”, “b”, or “c”, so:

print pressingButtons("42")
["ga","gb","gc","ha","hb","hc","ia","ib","ic"]

The politics of programming languages

One of the interesting things about this challenge is that we have written our solutions in Python. In Python, there is a very easy way of approaching this problem using the itertools package. When solving a problem in the real world (i.e. not in a technical interview) it is great when your problem is solved in a standard package! But in an interview setting, an interviewer asking you to sort a list is probably wanting you to implement a sorting algorithm instead of just using the language’s built-in sorting method.

Itertools is a borderline package. On the one hand, it’s built into the language. You need to show that you understand iterators in order to use it. It is also typically not taught in computer science classes, where they want you to make iterators by hand. So the fact that you know about itertools shows your real-word experience. Making your own custom solution when interviewing for a position that needs 3 years of Python experience might leave your interviewer wondering “doesn’t this candidate know about itertools?”

On the other hand, we try to serve a wide audience here at CodeSignal. Knowing that the itertools package is built in to Python doesn’t help when you are applying for that senior Java engineer position! I will show the itertools solution, and then show a solution that will work in a broader range of lower level languages.

using Python in technical interviews
Sometimes using Python feels like cheating.

Even if you are using Python in a technical interview, my advice would be to ask your interviewer if you are allowed to use the itertools package. Chances are your interviewer will say “no”, but at least you’ve demonstrated to her that you know the standard way of approaching one of these problems.

A solution using itertools

Here is the Python code that uses itertools:

from itertools import product

def pressingButtons(buttons):
    numPad = [""   ,""    ,"abc","def" ,"ghi","jkl",
              "mno","pqrs","tuv","wxyz"]
    charArr = [numPad[int(digit)] for digit in buttons]

    return [''.join(s) for s in product(*charArr) if s]

The array numPad stores the possible letters for each character. There are no letters associated with 0 or 1, so the entries are empty strings. We see numPad[4] = "ghi" and numPad[2] = "abc", for example. Next, we make an array of all the possible numbers we can substitute in charArr. Continuing with the example of 42, we would have charArr=["ghi","abc"].

So far, everything looks the same as it does in the more general solution. We now have to find a way of getting all the possible ways of getting a single letter from each of these strings. This is where the itertools product function does all the heavy lifting.

What product does is take an arbitrary number of iterables (think lists or strings) and returns a generator. Generators are really cool – they allow us to produce values “on demand” so we don’t have to stress the machine too much – but they are a whole other topic on their own. For the moment, we can force product to produce a list by casting it. The product operator is easy to understand from an example:

list(product([1,2,3],"ab")) = [(1, 'a'), (1, 'b'), (2, 'a'), (2, 'b'), (3, 'a'), (3, 'b')]

i.e. list(product( it1, it2, it3, ..., itN ) ) returns a list where each element is a tuple where the first element is from it1, the second elements is from it2, the third is from it3, and so on. Every possible combination is represented once.

Notice that if we have charArr = ["ghi","abc"] that list(product(charArr)) doesn’t give us quite what we want:

list(product(charArr)) = [('ghi',), ('abc',)]

This is because we have passed only one argument, charArr. It is iterable since there are two elements (“ghi” and “abc”), so it takes the first element and then the second. Python’s notation for a tuple with a single element looks strange: ("ghi",) is a tuple with only one element "ghi". The comma shows that this is a tuple and not just standard parenthesis. Notice that if we pass the two strings "ghi" and "abc", we get a different result:

list(product("ghi","abc")) =  [('g', 'a'), ('g', 'b'), ('g', 'c'),
                               ('h', 'a'), ('h', 'b'), ('h', 'c'),
                               ('i', 'a'), ('i', 'b'), ('i', 'c')]

The Python star operator (*) allows us to “unpack” a list, and pass the entries directly to a function. So we can write:

# charArr = ["ghi","abc"] as before
list(product(*charArr)) =  [('g', 'a'), ('g', 'b'), ('g', 'c'),
                            ('h', 'a'), ('h', 'b'), ('h', 'c'),
                            ('i', 'a'), ('i', 'b'), ('i', 'c')]

Now we just have to join each of these tuples such as ('g','a') into a string such as "ga". We can accomplish this with the join functions:

[''.join(s) for s in list(product(*charArr))] = ['ga', 'gb', 'gc',
                                                 'ha', 'hb', 'hc',
                                                 'ia', 'ib', 'ic']

We can get rid of the list statement now since we are feeding the output of product to a for loop. This is just the thing that generators were built for!

We have covered almost all the code in the return statement in our original function. The only missing piece is an if s at the end! The if s at the end deals with the corner case of what happens when you pass in an empty string for buttons (i.e. “” instead of “42”). The comments section on this question are all addressing what the answer should be – you should have a look and see what you think the question statement instructions say about what should happen with the empty string – but our tests only pass if pressingButtons("") returns an empty list []

Recursive approaches to this question

Let’s try to do this without resorting to itertools. We want a function allStrings that takes a list of strings charArr, and returns a list of all the possible strings with the first character from charArr[0], the second character from charArrr[1], etc. Our approach will be:

  1. If charArr only has one element, then return a list of the letters of charArr[0]. For example, if charArr=["abc"] then return ['a','b','c'].
  2. If charArr has more than one element, find the list of all the strings of every element except for the first by using the recursive call allStrings(charArr[1:]). Then append every string in this list with every letter from charArr[0].

In code, we have:

def allStrings(charArr):
    # base cases: charArr only has 0 or one element
    if len(charArr) == 0:
        return []
    if len(charArr) == 1:
        return list(charArr[0])

    # find the list of strings
    remaining = allStrings(charArr[1:])
    output = []

    for start_char in charArr[0]:
        # place start_char at the beginning of each string in
        # remaining
        output.extend([start_char + s for s in remaining])
    return output

Putting our solution together, we have:

# uses the earlier function allStrings
def pressingButtons(buttons):
    numPad = [""   ,""    ,"abc","def" ,"ghi","jkl",
              "mno","pqrs","tuv","wxyz"]
    charArr = [numPad[int(digit)] for digit in buttons]

    return allStrings(charArr)

Tell us…

Have you ever been in an interview where you could have used a built-in method to solve a challenge, but were told not to? Or the opposite – the interviewer told you to go ahead and use it? Let us know over on the CodeSignal forum!

CodeSignal Solves It: goodStringsCount

programming jobs technical interview problem solution

Our Interview Practice challenge this week is goodStringsCount, a combinatorics problem. As the name suggests, combinatorics deals with combinations of objects that belong to finite sets, and it’s one of those topics that come up a lot in technical interviews. This specific coding problem is from Apple, which makes sense since they’re known for asking combinatorics questions in their technical interviews!

If this isn’t your first CodeSignal Solves It rodeo, I bet you know what I’m going to ask next: Have you solved goodStringsCount yet? If not, hop to it! Once you’ve got your solution, come back and we’ll walk through the problem together.

combinatorics interview question solution
The new MacBook Pro looks really nice!

Done? Great! Let’s get into it.

The technical interview problem

For this programming interview question, our job is to write a function that counts “good strings” of length n. A “good string” is defined as a string that satisfies these three properties:

  1. The strings only contain the (English) lowercase letters a – z.
  2. Each character in the string is unique.
  3. Exactly one letter in the string is lexicographically greater than the one before it.

The first two conditions tell us that any string made up of lowercase letters qualifies. For example, there are no good strings of length 1, because the single letter doesn’t have anything to be greater than! In other words, goodStringsCount(1) should return 0.

Looking at good strings of length 3

Strings of length 2 are actually a little misleading, because they don’t really allow us to dig into the third condition on good strings. It just tells us that the characters have to appear in ascending order, as the second character has to be greater than the first. So let’s start by looking for good strings of length 3 instead!

The third condition tells us that “bfg” is not a good string because all the letters appear in ascending order. But “bgf” is a good string since:

  • looking at the first two characters: ‘b’ < ‘g’ (so there is a letter greater than the one that precedes it)
  • looking at the next two characters: ‘g’ > ‘f’ (so there is only one letter greater than the one that precedes it)

From the letters b, f, and g, there are six possible strings, of which 4 are “good strings”.

“bfg”, “bgf”, “fbg”, “fgb”, “gbf”, “gfb”

There is nothing particularly special about the characters b, f, and g. We could use a, h, and k as well (just replace b with a, h with k, and g with k in the list above). In fact, any three distinct letters would do, since the only property we used was their relative order. So we have:

[math] {\text{numGoodStringsOfLength3}} = 4 \times (\text{numWaysOfPicking3LettersFrom26}) = 4 \times {{26}\choose{3}} = 4 \times \frac{26!}{4! 22!}[/math]
where [math]n \choose {r}[/math] is n choose r, the function for counting the number of ways of choosing r elements from n. This is a common function in combinatorics problems, so I’ve included a discussion of it as a footnote to this article! We’ll be using this function a lot, so if you haven’t come across it before skip to the end and read about it now.

Overall strategy for this problem

After looking at a specific case, we get an idea of the structure of this problem. For finding the good strings of length n:

  • We need to find the number of ways of picking n letters, which is 26 choose n.
  • Given the numbers 1, 2, …, n, we need to find the number of ways we can arrange them in a “good sequence”. A “good sequence” is when exactly one number is greater than the one before it. We will call the number of good sequences of 1, 2, …, n good(n).

The idea is that for each set of n letters, we can put them in alphabetical order, and label the letters from 1 to n by their order in this array. Any good string then gives a “good sequence” of 1 to $n$, and any “good sequence” corresponds to a unique string (from these letters). For example, using n=3 and the letters b,f, g:
* The good sequence [2,3,1] corresponds to the good string “fgb”
* The good string “gbf” corresponds to the good sequence [3,1,2]

The number of good strings of length n is then [math]{{26}\choose{n}} \times \text{good}(n)[/math].

Finding good(n)

Let’s call our sequence [ a[1], a[2] , ..., a[n] ], where each a[i] is distinct and takes a value from 1 to n, inclusive. If this is a “good sequence”, then there is exactly one element a[j-1] that is greater than the element before it, a[j]. Note that this means that:

  • a[1] > a[2] > a[3] > … > a[j], i.e. the elements before a[j+1] make a decreasing sequence
  • a[j+1] > a[j+2] > a[j+3] > … > a[n], i.e. the elements from a[j+1] onward make a decreasing sequence

For example, a good sequence [4,2,5,3,1] can be broken into the two decreasing sequences [4,2] followed by [5,3,1]. In the notation above the value of j in [4,2,5,3,1] is 2 because a[2+1] = 5 is the only element bigger than the previous element.

Given the list of n numbers, there are n-1 choices for a[j]. (You can’t use the very last element, because there is no a[j+1].) Another way of seeing this is that we are splitting the array into two decreasing subarrays, and the only places that we can split on are the locations of the commas!

For now, let’s pick a particular value of j and ask how many good sequences of 1 through n we can make with this value of j. Given a set of distinct numbers, there is only one way of making them into a decreasing sequence, so really we just need to pick which j numbers go in the first decreasing sub-array, and which n-j numbers go in the second decreasing array. Note that once we pick the j numbers for the first array, the remaining numbers automatically go into the second array, so the number of distinct ways of splitting these numbers up is really just [math]{{n}\choose{j}}[/math]. (We have counted one problematic sequence in here, which we’ll come back to.)

In case you got lost in the n and js, here’s an example to help you out. Let’s pick n = 5 and j=2. Instead of counting all the good sequences, we are just counting the ones where the split happens at position 3. So we have:

[a[1], a[2]] make a decreasing sequence, and [a[3],a[4],a[5]] make a decreasing sequence.

How many sequences like this are there? I need to split 1,2,3,4,5 up into two pieces: two elements (i.e. j) become a[1] and a[2], while the remaining three elements (n-j) become a[3] through a[5]. But once I pick a[1] and a[2], whatever is left over has to go into the other array. There are [math]{{5}\choose{2}} = 10[/math] ways of picking a[1] and a[2]:
* a[1] and a[2] are 1,2: the sequences are [2,1] [5,4,3] -> [2,1,5,4,3]
* a[1] and a[2] are 1,3: the sequences are [3,1] [5,4,2] -> [3,1,5,4,2]
* a[1] and a[2] are 1,4: the sequences are [4,1] [5,3,2] -> [4,1,5,3,2]
* …
* a[1] and a[2] are 3,5: the sequences are [5,3] [4,2,1] -> [5,3,4,2,1]
* a[1] and a[2] are 4,5: the sequences are [5,4] [3,2,1] -> [5,4,3,2,1] (UH OH!)

The last one is problematic, because all the numbers are decreasing! So while there are 10 ways of making the split into two decreasing sequences with n=5 and j=2, there are only 9 good sequences (with n=5 and j=2) because these is one sequence where all the numbers are decreasing.

The example shows us the problem with the way we have counted by splitting into two decreasing arrays: We never made sure that a[j+1] was bigger than a[j]. Fortunately there is only one arrangement that doesn’t work: the arrangement [n,n-1,...,1], so we have over-counted by exactly one. The number of good sequences split of 1 through n, split into arrays of length j and n-j is therefore [math]{{n}\choose{j}} – 1[/math].

To determine good(n), we just need to add up all the different good sequences we can get by putting the cutoff between the two arrays at different locations j. We get:

[math]\text{good(n)} = \large\sum_{j=1}^{n-1} \bigg[{{n}\choose {j}} – 1\bigg] [/math]

This gets us to a point where we can write a program that will run quickly enough, so if you’re totally “mathed out” you can stop here! We can do a little bit better, through. Here are a couple of clever tricks:

  • [math]{{n}\choose0} = {{n}\choose {n}} = 1[/math], because there is 1 way of not choosing anything from n objects regardless of n – just don’t pick any of them! – and 1 way of choosing all n objects from n objects. So putting j=0 and j=n into [math]\bigg[{{n} \choose {j}} – 1\bigg][/math] gives zero. Now we can rewrite:
    [math]\text{good(n)} = \large\sum_{j=1}^{n-1} \bigg[{{n}\choose {j}} – 1\bigg] = \large\sum_{j=0}^{n} \bigg[{{n}\choose {j}} – 1\bigg] = \large\sum_{j=0}^{n} \bigg[{{n}\choose {j}}\bigg] – (n+1)[/math]
  • You might remember from Pascal’s triangle that
    [math]\large\sum_{j=0}^{n} \bigg[{n\choose {j}}\bigg] = 2^n[/math]
    One way of seeing this result is: The sum is asking us about the number of ways we can select a subset of n elements by figuring out the number of subsets with 0 elements, the number of subsets with 1 element, et cetera. Since each element is either in a subset or not, there are 2^n different subsets.

After all this work, we get:

[math]\text{good}(n) = \text{(number of good sequences of [1, 2, 3, …, n])} = 2^n – (n+1)[/math]

Putting it all together

From our overall strategy, we had [math]{26\choose {n}}[/math] ways of picking the n characters, and for each of our choices we had good(n) ways of arranging those characters into good strings. So the number of good strings of length n is:

[math]{26\choose {n}} \times (2^n – n – 1)[/math]

…and now, some code!

The hard part about this problem isn’t writing the code, but figuring out how to count efficiently. The code is pretty small in this case:

from math import factorial as fact

def nChooseR(n,r):
    # there are better ways that don't involve using
    # dealing with large integers. For the moment,
    # I am opting for pragmatism.
    return fact(n)/((fact(n-r)*fact(r)))

def goodStringsCount(n):
    return nChooseR(26,n)*(2**n - n - 1)

Footnote: Ways of choosing r objects from n objects

How do we get the formula for [math]{n\choose {r}}[/math] in the first place? Suppose we have n=6 objects: a flag, a cookie, a laptop, a pen, a cup of coffee, and an orange. Note that we have picked easy-to-distinguish items! We want to select r = 4 of them. How many different sets of items could we come up with?

We have n choices for which item we pick first. We have n-1 choices for the second item, and so on. So it seems like the result is n(n-1)(n-2)(n-3). This becomes a little inconvenient to write for a general r (in this case, we know that r = 4), but notice that:
[math]n(n-1)(n-2)(n-3) = n(n-1)(n-2)(n-3)\times \frac{(n-4) \ldots (2)(1)}{(n-4)\ldots(2)(1)} = \frac{n!}{(n-4)!}[/math]
This formula over-counts the number of different sets of items we could have because selecting the laptop, then the coffee, then the orange, and then the pen would give you the same set of items as selecting the coffee, followed by the orange, the pen, and the laptop. In the formula above, we have counted these two arrangements separately! (This is called a permutation of selecting 4 items from n, and is another useful formula to have under your belt).

How many different orders are there for selecting these 4 items? This is the number of times we have over-counted each set of items we could end up with. We’ll have 4 choices for whichever one we could have picked first (laptop, coffee, orange, or pen) without affecting the items we end up with. With the first item selected, we have 3 items to choose from for the second item, 2 choices for the third item, and then the last item is the 1 thing left over. So we can rearrange these 4 items in [math]4\times 3 \times 2 \times 1 = 4![/math] ways. The number of combinations for picking r=4 things from n items is:

[math]{n \choose 4} = \frac{n!}{(n-4)!4!}[/math]

You can (and should!) run through this argument for a general value of r, and convince yourself that the number of ways of choosing r items from n is:

[math]{n \choose {r}} = \frac{n!}{(n-r)!r!}[/math]

Tell us…

How did you solve this combinatorics interview question? Did you solution differ from mine? Let us know over on the CodeSignal forum!