C++/C++
Expert: vijayan - 3/3/2010
QuestionHi,
I need to implement an algorithm for using c++ data structures.
I have got two files file1 and file2.
In file1, the words are split across individual lines. For example, each word in the sentence "I live in London" appears in separate lines. Each word in the file1 has their equivalents patterns. For example the word "live” has its equivalent pattern "VB KL BL". Normally, the number of lines in the file will be file in terms of thousands (4 or 5 thousands)
The way lines are stored in file1 as follows:
=============================================
Letter/word patterns1
I SJ DL KP
live VB KL BL
in IJ DL ML
London OJ KL BL
...... .. .. ..
...... .. .. ..
etc.....
file2 is rule file which contains the input patterns and their corresponding output patterns
file2
------
input pattern output pattern
DL KP MN OP
KL BL QR QP
SJ DL KP AB DF
VB DL BL FG KE
IJ DL ML KQ PT
OJ KL BL AK BM
DL ML BT
-- -- -- --- --
I need to parse file1 line by line and identify their corresponding patterns ( Ex: In the first line "SJ DL KP" is the pattern for letter "I"). Once the pattern1 (say "SJ DL KP" in file1) is identified then I have to search this pattern (pattern1) in the file2 (“input pattern" column). So in the file2 the equivalent pattern for "SJ DL KP" is "AB DF"("output pattern" column)
My requirement is to parse the file1 for each word and identify the corresponding pattern in patterns1 column. Once the pattern is identified then search this pattern in the file2 (under input pattern column). Then match the pattern identified in "input pattern" column with the "output pattern" column. Then I have to generate file3 with new pattern selected from "output pattern" column of file2.
I need to implement an efficient (low memory,less iterations, more speed) algorithm using c++. Can any one from here help me with proper design, data structures and iterators
Regards
Devamani
Answer> parse the file1 for each word and identify the corresponding pattern in patterns1 column.
This is trivial.
> then search this pattern in the file2
This is the key step. You could use a string searching algorithm like the Boyer–Moore string search algorithm:
http://en.wikipedia.org/wiki/Boyer%E2%80%93Moore_string_search_algorithm
The rest of the steps just involve table look-ups.