C# Practice Interview test questions

Question 1

Write a function called top_3_words that, given a string of text (possibly containing punctuation), returns an array of the three most frequently occurring words, in descending order of the number of occurrences.

Assumptions:

  • A word is a string of letters (A to Z), optionally containing one or more apostrophes ('), as in Example 3.
  • Matches should be case-insensitive, and the words in the result should be lowercased.
  • If a text contains fewer than three unique words, then either the top-2 or top-1 words should be returned, or an empty array if a text contains no words.

Example 1

top_3_words(@"And here are two of the most immediately useful thoughts you will dip into. 
First that things cannot touch the mind: they are external and inert; anxieties can only come from your internal judgement. 
Second, that all these things you see will change almost as you look at them, and then will be no more. 
Constantly bring to mind all that you yourself have already seen changed. 
The universe is change: life is judgement.")

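Result:

> ["you", "and", "the"]

In this text "you" appears four times, while "and", "the", "will" and "that" each appear three times, so any two of those four may follow "you" depending on how ties are broken.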

Example 2

top_3_words("e e e e DDD ddd DdD: ddd ddd aa aA Aa, bb cc cC e e e")

Result:

> ["e", "ddd", "aa"]

Example 3

top_3_words("  //wont won't won't")

Result:

> ["won't", "wont"]

Bonus points (not really, but just for fun):

  1. Avoid creating an array whose memory footprint is roughly as big as the input text.
  2. Avoid sorting the entire array of unique words

SOLUTION

using System;
using System.Linq;
using System.Collections.Generic;

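public class TopWords
{
	public static List<string> Top3(string s)
	{
		// Maps each lowercased word to the number of times it appears.
		var frequencies = new Dictionary<string, int>();
		// Every distinct punctuation character that occurs in the input.
		var punctuation = s.Where(Char.IsPunctuation).Distinct().ToArray();
		// Split on spaces only, then trim punctuation from both ends of each token.
		var words = s.Split(' ').Select(x => x.Trim(punctuation)).ToList();
		words = words.ConvertAll(d => d.ToLower());
		foreach (var currentWord in words)
		{
			// Skip the empty tokens left behind by repeated spaces or bare punctuation.
			if (!string.IsNullOrWhiteSpace(currentWord))
			{
				if (!frequencies.ContainsKey(currentWord))
				{
					frequencies.Add(currentWord, 1);
				}
				else
				{
					frequencies[currentWord]++;
				}
			}
		}

		// Sort by count (highest first) and keep the three most frequent words.
		return frequencies.OrderByDescending(r => r.Value).Take(3).Select(r => r.Key).ToList();
	}
}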

Explanation

  1. We create a frequencies dictionary. A dictionary lets us associate a "key" with a "value"; the strategy here is to associate each unique word with the number of times it appears.

  2. var punctuation = s.Where(Char.IsPunctuation).Distinct().ToArray(); This code extracts every distinct punctuation character from the input string and puts it into an array. We're going to use this array in the next line.

  3. var words = s.Split(' ').Select(x => x.Trim(punctuation)).ToList();

    • s.Split(' ') splits the text on each space character and returns an array of tokens.
    • .Select(x => x.Trim(punctuation)) takes each token and trims the punctuation characters extracted earlier. String.Trim(Char[]) removes the given characters from the start and the end of a string (leading and trailing) only, so an apostrophe inside a word such as won't is kept.
  4. Once we've converted all the split and trimmed words to lowercase using ConvertAll(), we move on to the foreach loop, where we iterate through every word in the list and record it in the dictionary.

    We filter out the elements that are null/white space.

    Now, this is the beautiful part of this solution: we check whether the word is already in the dictionary. If it isn't, we add it with a value of 1; if it is already present, we increment its value.

  5. Finally, we order the dictionary entries by their value in descending order, take the top 3 and return their keys as a list. Boom. Top 3 words.
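
To make steps 2 to 4 a little more concrete, here is a minimal sketch of the two building blocks on their own: trimming punctuation from the edges of a token, and counting words in a dictionary. The names TrimAndCountSketch, TrimToken and Count are hypothetical and exist only for this illustration; the counting loop uses TryGetValue as an alternative to the ContainsKey check in the solution, and it is not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TrimAndCountSketch
{
    // Hypothetical helper: collects the punctuation found in `text`
    // and trims those characters from both ends of `token`.
    public static string TrimToken(string text, string token)
    {
        char[] punctuation = text.Where(c => char.IsPunctuation(c)).Distinct().ToArray();
        return token.Trim(punctuation);
    }

    // Hypothetical helper: associates each word with its number of occurrences.
    public static Dictionary<string, int> Count(IEnumerable<string> words)
    {
        var frequencies = new Dictionary<string, int>();
        foreach (var word in words)
        {
            // `seen` is 0 when the word is not in the dictionary yet.
            frequencies.TryGetValue(word, out int seen);
            frequencies[word] = seen + 1;
        }
        return frequencies;
    }
}
```

With the text "DdD: aA, won't", the token "DdD:" trims to "DdD", while "won't" comes back unchanged because its apostrophe is not at either end of the token.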

Bonus - pure LINQ solution

using System;
using System.Linq;
using System.Text.RegularExpressions;
using System.Collections.Generic;

public class TopWords
{
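    public static List<string> Top3(string s)
    {
        // A word is one or more letters, optionally mixed with apostrophes.
        return Regex.Matches(s.ToLowerInvariant(), @"('*[a-z]'*)+")
            .Cast<Match>() // needed on .NET Framework, where MatchCollection is only a non-generic IEnumerable
            .GroupBy(match => match.Value)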
            .OrderByDescending(g => g.Count())
            .Select(p => p.Key)
            .Take(3)
            .ToList();
    }
}
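
For the two bonus points listed with the question, one possible approach is sketched below. It is an illustration under stated assumptions rather than a reference answer: the class name TopWordsStreaming is hypothetical, a word is taken to be ASCII letters plus apostrophes with at least one letter, and ties are broken by whichever entry the dictionary enumerates first. The text is scanned one character at a time, so no array of all tokens is built, and the three best entries are tracked in a small list instead of sorting every unique word.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class TopWordsStreaming
{
    public static List<string> Top3(string s)
    {
        var counts = new Dictionary<string, int>();
        var current = new StringBuilder();
        bool hasLetter = false;

        // Scan one character at a time; i == s.Length acts as a final delimiter.
        for (int i = 0; i <= s.Length; i++)
        {
            char c = i < s.Length ? s[i] : ' ';
            bool isLetter = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
            if (isLetter || c == '\'')
            {
                current.Append(char.ToLowerInvariant(c));
                hasLetter |= isLetter;
                continue;
            }

            // A delimiter ends the current token; count it only if it has a letter.
            if (hasLetter)
            {
                string word = current.ToString();
                counts.TryGetValue(word, out int seen);
                counts[word] = seen + 1;
            }
            current.Clear();
            hasLetter = false;
        }

        // Keep at most three entries, ordered by count, instead of a full sort.
        var top = new List<KeyValuePair<string, int>>(4);
        foreach (var entry in counts)
        {
            int pos = top.Count;
            while (pos > 0 && top[pos - 1].Value < entry.Value) pos--;
            if (pos < 3)
            {
                top.Insert(pos, entry);
                if (top.Count > 3) top.RemoveAt(3);
            }
        }

        var result = new List<string>(top.Count);
        foreach (var entry in top) result.Add(entry.Key);
        return result;
    }
}
```

The dictionary still grows with the number of unique words, but no intermediate structure is proportional to the length of the input text, and selecting the top three is a single pass over the unique words.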