Below are sample problems used in the IEEEXtreme 2006 challenge.

 

Cosine Similarities

Value rank: 40

Simple matching, Jaccard, Tanimoto and cosine similarities are statistics used for comparing the similarity and diversity of sample sets. They are used in data mining for tasks ranging from classifying diverse chemical compounds to text processing and searching in large databases.

The sample sets for comparison are represented with vectors of the attributes we want to compare. Given 2 vectors of attributes, A and B, the cosine similarity, θ, can be calculated using the dot product and the magnitudes as:

θ = arccos( (A · B) / (|A| |B|) )

Since the angle θ lies in the range [0, π], a value of π means the vectors are exactly opposite, π/2 means they are independent, and 0 means they are exactly the same; values in between indicate intermediate degrees of similarity or dissimilarity.
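As a minimal sketch of the calculation described above, the angle can be computed from the dot product and the two magnitudes. The function name `cosine_angle` and the choice to return `None` for mismatched lengths, empty vectors, or zero vectors are assumptions made for illustration; the problem statement only says that ERROR is reported when the operation cannot be performed.

```python
import math


def cosine_angle(a, b):
    """Return the angle in radians between integer vectors a and b.

    Returns None when the angle is undefined (hypothetical convention):
    different lengths, empty vectors, or a vector of magnitude zero.
    """
    if len(a) != len(b) or not a:
        return None
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return None  # a zero vector has no direction, so no angle
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_theta = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.acos(cos_theta)


if __name__ == "__main__":
    print(cosine_angle([1, 0], [0, 1]))   # perpendicular vectors
    print(cosine_angle([6, 5, 5], [1, 1]))  # mismatched lengths
```

For the vectors [3,2,0,5,0,0,0,2,0,0] and [1,0,0,0,0,0,0,1,0,2] this evaluates to roughly 1.25037 radians. A printed value of 1.2503 would be consistent with truncating, rather than rounding, to four decimals, although the problem statement does not say which is intended.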

Your task is to write a program that receives two attribute vectors, A and B, and outputs the cosine similarity of those vectors. A and B are passed as command-line parameters, each a comma-separated list of integer values enclosed in square brackets, with the two lists separated by a space.

The output should be a single line containing the cosine similarity value with 4 decimals of precision, or the word ERROR if any of the entries is not valid or the operation cannot be performed.
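Input validation is one way to decide when to report the error case. The sketch below parses one bracketed argument with a regular expression. The helper name `parse_vector` is hypothetical, and several choices are assumptions not stated in the problem: negative integers are accepted, whitespace inside the brackets is rejected, and an empty list `[]` is treated as invalid.

```python
import re

# One or more optionally signed integers separated by commas, inside [ ].
_VECTOR = re.compile(r"\[(-?\d+(?:,-?\d+)*)\]")


def parse_vector(text):
    """Return a list of ints for input like '[3,2,0]', or None if malformed."""
    match = _VECTOR.fullmatch(text)
    if match is None:
        return None
    return [int(item) for item in match.group(1).split(",")]


if __name__ == "__main__":
    print(parse_vector("[3,2,0,5]"))  # a well-formed list
    print(parse_vector("[,]"))        # malformed: no values between commas
```

A caller would report the error word whenever either argument parses to `None`, before attempting the calculation.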

examples

> program [3,2,0,5,0,0,0,2,0,0] [1,0,0,0,0,0,0,1,0,2]
1.2503

> program [6,5,5] [1,1]
Error

> program [,] []
Error

 
 


 

Telephone Keyboard Input Recognition

Value rank: 60

On a standard telephone, each of the digits 1-9 corresponds to a set of characters:

1: space    2: ABC    3: DEF
4: GHI      5: JKL    6: MNO
7: PQRS     8: TUV    9: WXYZ

Using the keypad, you can 'spell' words by entering the digits that correspond to each letter of the word. For example, 'words' is spelled 96737.
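The spelling rule can be sketched as a lookup from each character to its key. The names `KEYPAD`, `CHAR_TO_DIGIT`, and `spell` are chosen for illustration only and do not come from the problem statement.

```python
# Keypad layout as given in the problem: digit -> characters on that key.
KEYPAD = {
    "1": " ",
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}

# Invert the layout so each character maps to its digit.
CHAR_TO_DIGIT = {ch: digit for digit, chars in KEYPAD.items() for ch in chars}


def spell(word):
    """Return the digit string that 'spells' word on the keypad."""
    return "".join(CHAR_TO_DIGIT[ch] for ch in word.lower())


if __name__ == "__main__":
    print(spell("words"))  # prints 96737
```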

For this problem, we are given a dictionary file with no more than 100,000 words, one per line, sorted in alphabetical order. Each word consists of no more than 18 characters, all lowercase letters from the phone keypad. A (very short!) example dictionary file, words.txt, is listed with the examples below.

Your program should read a string of digits (from 2 to 9, not using 1 as space) from the console and find the words in the dictionary whose spellings contain that series of consecutive digits anywhere within the word.

• If there are no matches, print the string 'No matches'.
• If there is one match, print the matching word.
• If there are n>1 matches, print the string 'n matches:' followed by the matching words, one per line.
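The matching and reporting rules above might be sketched as follows. This is one possible approach, not a reference solution: each word is translated to its digit spelling and tested with a substring search. The names `TO_DIGITS`, `find_matches`, and `report` are hypothetical, and the sketch takes the word list as an in-memory list rather than reading the dictionary file.

```python
# Translation table from lowercase letters to keypad digits (2-9).
TO_DIGITS = str.maketrans(
    "abcdefghijklmnopqrstuvwxyz",
    "22233344455566677778889999",
)


def find_matches(words, digits):
    """Return the words whose digit spelling contains digits, in input order."""
    return [word for word in words if digits in word.translate(TO_DIGITS)]


def report(matches):
    """Return the output lines for a list of matching words."""
    if not matches:
        return ["No matches"]
    if len(matches) == 1:
        return list(matches)
    return [f"{len(matches)} matches:"] + list(matches)


if __name__ == "__main__":
    sample = ["cappuccino", "chocolate", "cinnamon", "coffee", "latte", "vanilla"]
    for line in report(find_matches(sample, "626")):
        print(line)
```

Because the dictionary is already sorted alphabetically, keeping the input order means the matches are printed in alphabetical order without an extra sort.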

NOTE: To make it easier to read the examples below, these are the 'spellings' of the words in words.txt, in digits:

cappuccino: 2277822466
chocolate: 246265283
cinnamon: 24662666
coffee: 263333
latte: 52883
vanilla: 8264552

examples

 contents of words.txt:

cappuccino
chocolate
cinnamon
coffee
latte
vanilla

> program words.txt 22222
No matches

> program words.txt 3333
coffee

> program words.txt 626
2 matches:
chocolate
cinnamon

 
 
