Below are sample problems used in the IEEEXtreme 2006 challenge.
Value rank: 40
Simple matching, Jaccard, Tanimoto and cosine similarities are statistics used for comparing the similarity and diversity of sample sets. They are used in data mining for tasks ranging from classifying diverse chemical compounds to text processing and searching in large databases.
The sample sets for comparison are represented as vectors of the attributes we want to compare. Given two attribute vectors, A and B, the cosine similarity, θ, can be calculated from their dot product and magnitudes as:
$$\theta = \arccos\left(\frac{A \cdot B}{\|A\|\,\|B\|}\right)$$
Since the angle θ lies in the range [0, π], a value of π means the vectors are exactly opposite, π/2 means they are independent, and 0 means they are exactly the same; in-between values indicate intermediate similarities or dissimilarities.
Your task is to write a program that receives two attribute vectors, A and B, and outputs the cosine similarity of those vectors. A and B will be entered as parameters on the command line, as square-bracket-delimited, comma-separated lists of integer values, separated by a space.
The output should be a single line containing the cosine similarity value with 4 decimals of precision, or the word ERROR if any of the entries is not valid or the operation cannot be performed.
> program [3,2,0,5,0,0,0,2,0,0] [1,0,0,0,0,0,0,1,0,2]
> program [6,5,5] [1,1]
> program [,] 
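One possible approach to this task, sketched in Python. The function names (parse_vector, cosine_similarity) are illustrative and not part of the problem statement. The sketch assumes that vectors of different lengths and zero-magnitude vectors count as "the operation cannot be performed", which the statement does not spell out.

```python
import math
import re
import sys

# One or more (optionally negative) integers, comma-separated, inside [ ].
VECTOR_RE = re.compile(r"\[(-?\d+(?:,-?\d+)*)\]")

def parse_vector(text):
    """Parse '[1,2,3]' into a list of ints; raise ValueError if malformed."""
    match = VECTOR_RE.fullmatch(text)
    if match is None:
        raise ValueError("invalid vector: " + text)
    return [int(item) for item in match.group(1).split(",")]

def cosine_similarity(arg_a, arg_b):
    """Return the angle between the vectors as a 4-decimal string, or 'ERROR'."""
    try:
        a = parse_vector(arg_a)
        b = parse_vector(arg_b)
    except ValueError:
        return "ERROR"
    # Assumption: unequal lengths make the dot product undefined.
    if len(a) != len(b):
        return "ERROR"
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    # Assumption: a zero vector has no direction, so no angle exists.
    if norm == 0:
        return "ERROR"
    dot = sum(x * y for x, y in zip(a, b))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cosine = max(-1.0, min(1.0, dot / norm))
    return f"{math.acos(cosine):.4f}"

if __name__ == "__main__":
    # Anything other than exactly two arguments is treated as invalid input.
    if len(sys.argv) == 3:
        print(cosine_similarity(sys.argv[1], sys.argv[2]))
    else:
        print("ERROR")
```

The clamp before math.acos matters for identical vectors, where the computed ratio can land a hair above 1 in floating point.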
Telephone keyboard input recognition
Value rank: 60
On a standard telephone, each of the digits 1-9 corresponds to a set of letters:
1: space 2: ABC 3: DEF 4: GHI 5: JKL 6: MNO
7: PQRS 8: TUV 9: WXYZ
Using the keypad, you can 'spell' words by entering the digits that correspond to each letter of the word. For example, 'words' is spelled 96737.
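The spelling rule can be sketched in Python as a lookup table built from the keypad layout above. The names KEYPAD, LETTER_TO_DIGIT and spell are illustrative only.

```python
# Keypad layout from the problem statement (digit -> letters on that key).
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}

# Invert the layout so each letter maps to its digit.
LETTER_TO_DIGIT = {
    letter: digit for digit, letters in KEYPAD.items() for letter in letters
}
LETTER_TO_DIGIT[" "] = "1"  # key 1 is the space

def spell(word):
    """Return the digit string that 'spells' the given word."""
    return "".join(LETTER_TO_DIGIT[ch] for ch in word.lower())

print(spell("words"))  # 96737
```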
For this problem, we are given a dictionary file (words.txt in the examples below) with no more than 100,000 words, one per line, sorted in alphabetical order. Each word consists of no more than 18 characters, all lowercase letters from the phone keypad. Here is a (very short!) example of a dictionary file we will use in the examples:
Your program should read a string of digits (from 2 to 9, not using 1 as space) from the console and find the words in the dictionary whose spellings contain that series of consecutive digits anywhere within the word.
• If there are no matches, print the string 'No matches'
• If there is one match, print the matching word.
• If there are n>1 matches, print the string 'n matches:' followed by the matching words, one per line.
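A minimal Python sketch of the search and the three output cases. It assumes the dictionary path and the digit string arrive as command-line arguments, as in the example invocations, rather than being read from the console. The helper names (find_matches, format_result) are illustrative.

```python
import sys

# Translation table from lowercase letters to keypad digits.
DIGITS = str.maketrans("abcdefghijklmnopqrstuvwxyz",
                       "22233344455566677778889999")

def find_matches(words, digits):
    """Return the words whose digit spelling contains `digits` anywhere."""
    return [word for word in words if digits in word.translate(DIGITS)]

def format_result(matches):
    """Build the output text for zero, one, or several matches."""
    if not matches:
        return "No matches"
    if len(matches) == 1:
        return matches[0]
    return "\n".join([f"{len(matches)} matches:"] + matches)

def main(argv):
    # Assumed invocation: program <dictionary file> <digit string>
    path, digits = argv[1], argv[2]
    with open(path) as handle:
        words = [line.strip() for line in handle if line.strip()]
    print(format_result(find_matches(words, digits)))

if __name__ == "__main__" and len(sys.argv) == 3:
    main(sys.argv)
```

A linear scan with a substring test is enough here, since the dictionary holds at most 100,000 words of at most 18 characters each. Because the file is sorted alphabetically and the scan preserves order, matches come out in dictionary order.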
NOTE: To make it easier to read the examples below, these are the 'spellings' of the words in words.txt, in digits:
contents of words.txt:
> program words.txt 22222
> program words.txt 3333
> program words.txt 626