【POJ 2001】 Shortest Prefixes: Trie, Hash Map

Description

A prefix of a string is a substring starting at the beginning of the given string. The prefixes of “carbon” are: “c”, “ca”, “car”, “carb”, “carbo”, and “carbon”. Note that the empty string is not considered a prefix in this problem, but every non-empty string is considered to be a prefix of itself. In everyday language, we tend to abbreviate words by prefixes. For example, “carbohydrate” is commonly abbreviated by “carb”. In this problem, given a set of words, you will find for each word the shortest prefix that uniquely identifies the word it represents.

In the sample input below, “carbohydrate” can be abbreviated to “carboh”, but it cannot be abbreviated to “carbo” (or anything shorter) because there are other words in the list that begin with “carbo”.

An exact match will override a prefix match. For example, the prefix “car” matches the given word “car” exactly. Therefore, it is understood without ambiguity that “car” is an abbreviation for “car”, not for “carriage” or any of the other words in the list that begin with “car”.
Input

The input contains at least two, but no more than 1000 lines. Each line contains one word consisting of 1 to 20 lower case letters.
Output

The output contains the same number of lines as the input. Each line of the output contains the word from the corresponding line of the input, followed by one blank space, and the shortest prefix that uniquely (without ambiguity) identifies this word.
Sample Input

carbohydrate
cart
carburetor
caramel
caribou
carbonic
cartilage
carbon
carriage
carton
car
carbonate
Sample Output

carbohydrate carboh
cart cart
carburetor carbu
caramel cara
caribou cari
carbonic carboni
cartilage carti
carbon carbon
carriage carr
carton carto
car car
carbonate carbona

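Problem summary: for each word, find its shortest prefix that is not also a prefix of any other word. If every prefix is shared, the word itself is the answer.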
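
Read literally, this definition can be checked by brute force, which is a handy reference when testing a faster solution. The following is a minimal sketch, not the approach used in this post; the function name bruteForcePrefixes is hypothetical, and it assumes every word is non-empty, as the problem guarantees.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: for each word, try prefixes of growing length and
// return the first one that no other word in the list starts with.
std::vector<std::string> bruteForcePrefixes(const std::vector<std::string>& words) {
    std::vector<std::string> result;
    for (std::size_t i = 0; i < words.size(); ++i) {
        const std::string& w = words[i];
        assert(!w.empty());  // the problem guarantees 1 to 20 letters per word
        std::string best = w;  // fallback: every prefix is shared, so use the whole word
        for (std::size_t len = 1; len <= w.size(); ++len) {
            bool shared = false;
            for (std::size_t j = 0; j < words.size() && !shared; ++j) {
                // compare() clamps to the end of words[j], so a shorter word never matches
                if (j != i && words[j].compare(0, len, w, 0, len) == 0) shared = true;
            }
            if (!shared) {
                best = w.substr(0, len);
                break;
            }
        }
        result.push_back(best);
    }
    return result;
}
```

With at most 1000 words of at most 20 letters, the triple loop stays small, so this is fast enough to cross-check results on the full input size.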

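Approach (trie):

This is a template problem. As each string is read, it is inserted into a trie, which records all of its prefixes. The search function then walks the string again and looks for the first position at which its prefix is no longer shared with any other string. Because the string's own contribution has to be discounted, a map (std::map in the code below, which is tree-based rather than a true hash table) counts at insertion time how many words each prefix occurs in. During the search, the first prefix whose count is 1 is the shortest unique prefix. If no such prefix exists, the whole word is the answer.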

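#include <iostream>
#include <map>
#include <string>
#include <vector>
using namespace std;
const int MAX_NODE = 1000 * 20 + 10;   // at most 1000 words of at most 20 letters each
const int CHARSET = 26;
int trie[MAX_NODE][CHARSET] = {0};     // trie[p][c]: child of node p along letter c, 0 if absent
int k = 1;                             // next unused node index (node 0 is the root)
map<string,int> flag;                  // number of input words that start with each prefix

void insert(const string& w){
    string t;                          // prefix of w read so far
    int len = w.size();
    int p = 0;
    for(int i=0; i<len; i++){
        int c = w[i] - 'a';
        t += w[i];
        flag[t] ++;                    // one more word has t as a prefix
        if(!trie[p][c]){        // node p has no child for letter c yet
            trie[p][c] = k;     // allocate a new node for it
            k++;        // advance the allocation counter
        }
        p = trie[p][c];     // move p down to the child
    }
}

// Returns the length of the shortest prefix that uniquely identifies s.
// Assumes s has already been inserted.
int search(const string& s){
    int len = s.size();
    int p = 0;
    string t;
    for(int i=0; i<len; i++){
        int c = s[i] - 'a';
        t += s[i];
        if(!trie[p][c]||flag[t]==1) return i+1;       // only one word has this prefix, so it is the answer
        p = trie[p][c];  // otherwise keep walking down
    }
    return len;       // every prefix is shared, so the whole word is the answer (exact match wins)
}


int main()
{
    string s;
    vector<string> all;
    while(cin>>s)
    {
        insert(s);
        all.push_back(s);
    }
    for(size_t i=0;i<all.size();i++)
    {
        int len = search(all[i]);
        cout<<all[i]<<' '<<all[i].substr(0,len)<<'\n';
    }
    return 0;
}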
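
As an alternative to keeping a separate map of prefix counts, the count can live in the trie node itself: each node stores how many inserted words pass through it, and the first node on a word's path with a count of 1 marks the shortest unique prefix. The sketch below illustrates this variant. It is not the code from this post, and the names CountingTrie, add, and shortestUniquePrefix are hypothetical. It assumes lowercase words that were all added before any query.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical illustration: a trie whose nodes count the words passing through them.
struct CountingTrie {
    struct Node {
        int next[26];  // child index per letter, 0 means "no child"
        int pass;      // number of inserted words whose path visits this node
        Node() : pass(0) { for (int i = 0; i < 26; ++i) next[i] = 0; }
    };
    std::vector<Node> nodes;

    CountingTrie() : nodes(1) {}  // node 0 is the root

    void add(const std::string& w) {
        int p = 0;
        for (std::size_t i = 0; i < w.size(); ++i) {
            int c = w[i] - 'a';
            assert(c >= 0 && c < 26);  // lowercase letters only
            if (nodes[p].next[c] == 0) {
                nodes[p].next[c] = static_cast<int>(nodes.size());
                nodes.push_back(Node());
            }
            p = nodes[p].next[c];
            ++nodes[p].pass;
        }
    }

    // Assumes w was previously passed to add().
    std::string shortestUniquePrefix(const std::string& w) const {
        int p = 0;
        for (std::size_t i = 0; i < w.size(); ++i) {
            p = nodes[p].next[w[i] - 'a'];
            if (nodes[p].pass == 1) return w.substr(0, i + 1);
        }
        return w;  // every prefix is shared, so the whole word is the answer
    }
};
```

Storing the count per node avoids building a string key for every prefix, so each insertion and query is a single walk down the trie.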


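Copyright notice: this is an original article by qq_45492531, licensed under CC 4.0 BY-SA. When reposting, please include a link to the original source and this notice.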