C程序从字符串中删除重复的字符

问题描述：

我遇到了一个采访问题，要求在给定的字符串中删除重复的字符。所以如果输入是“hi there”，那么预期的输出是“hi ter”。它还被告知只考虑按字母顺序排列，所有的字母都是小写字母。我想出了以下程序。我有意见让我的逻辑清楚。但该计划不能按照某些投入的预期工作。如果输入是“hii”，它会起作用，但是如果它的“hi there”失败了。请帮忙。C程序从字符串中删除重复的字符

#include <stdio.h> 
int main() 
{ 
    char str[] = "programming is really cool"; // original string. 
    char hash[26] = {0}; // hash table. 
    int i,j; // loop counter. 
// iterate through the input string char by char. 
for(i=0,j=0;str[i];) 
{ 
    // if the char is not hashed. 
    if(!hash[str[i] - 'a']) 
    { 
     // hash it. 
     hash[str[i] - 'a'] = 1; 
     // copy the char at index i to index j. 
     str[j++] = str[i++]; 
    } 
    else 
    { 
     // move to next char of the original string. 
     // do not increment j, so that later we can over-write the repeated char. 
     i++; 
    } 
} 

// add a null char. 
str[j] = 0; 

// print it. 
printf("%s\n",str); // "progamin s ely c" expected. 

return 0;

}

请问您有在预期输出一个错字;如果它不是“hi there” - >“hi ter”或“Hi there” - >“Hi ther”？ – Christoffer 2010-02-11 11:33:07

这一个函数是一个关联数组，但绝对不是一个哈希表。 – kennytm 2010-02-11 12:04:20

答

时str[i]是一个非字母，说的空间，当你这样做：

hash[str[i] - 'a']

程序可以吹。

的space ASCII值是32和的a是97所以你有效访问数组散列具有负指数。

为了解决这个问题，你可以通过做忽略非字母：

if(! isalpha(str[i]) { 
    str[j++] = str[i++]; // copy the char. 
    continue; // ignore rest of the loop. 
}

您也许只需要对小写字母进行明确的检查？ – 2010-02-11 11:45:08

答

这是要上，因为你任何空格字符（或任何其他范围外的“A” ..“Z”）打破正在访问您的散列阵列的界限之外。

答

...

// iterate through the input string char by char. 
for(i=0,j=0;str[i];) 
{ 
    if (str[i] == ' ') 
    { 
    str[j++] = str[i++]; 
    continue; 
    } 

    // if the char is not hashed. 
    if(!hash[str[i] - 'a']) 
    {

...

如果除了空格之外还有一个非英文字母，比如说'？'，这将再次失败。 – gameover 2010-02-11 11:39:42

@gameover：好点 – tur1ng 2010-02-11 11:45:20

答

这是代码高尔夫球，对吧？

d(s){char*i=s,*o=s;for(;*i;++i)!memchr(s,*i,o-s)?*o++=*i:0;*o=0;}

你应该忽略空格。 – kennytm 2010-02-11 12:14:50

您应该将其发布到IOCCC ...;） – t0mm13b 2010-02-11 12:56:04

答

#include <stdio.h> 
#include <string.h> 

int hash[26] = {0}; 

static int in_valid_range (char c); 
static int get_hash_code (char c); 

static char * 
remove_repeated_char (char *s) 
{ 
    size_t len = strlen (s); 
    size_t i, j = 0; 
    for (i = 0; i < len; ++i) 
    { 
     if (in_valid_range (s[i])) 
    { 
     int h = get_hash_code (s[i]); 
     if (!hash[h]) 
     { 
      s[j++] = s[i]; 
      hash[h] = 1; 
     } 
    } 
     else 
    { 
     s[j++] = s[i]; 
    } 
    } 
    s[j] = 0; 
    return s; 
} 

int 
main (int argc, char **argv) 
{ 
    printf ("%s\n", remove_repeated_char (argv[1])); 
    return 0; 
} 

static int 
in_valid_range (char c) 
{ 
    return (c >= 'a' && c <= 'z'); 
} 

static int 
get_hash_code (char c) 
{ 
    return (int) (c - 'a'); 
}

两件事：in_valid_range（）只是islower（），因此为简洁起见可以将其删除; get_hash_code（）（和所有哈希代码的处理）应该是无符号的，因为'char'可能不是，因此（int）（c - 'a'）可能是负数。 – unwind 2010-02-11 12:11:04

答

char *s; 
int i = 0; 

for (i = 0; s[i]; i++) 
{ 
    int j; 
    int gap = 0; 
    for (j = i + 1; s[j]; j++) 
    { 
     if (gap > 0) 
      s[j] = s[j + gap]; 
     if (!s[j]) 
      break; 
     while (s[i] == s[j]) 
     { 
      s[j] = s[j + gap + 1]; 
      gap++; 
     } 
    } 
}

答

void striprepeatedchars(char *str) 
{ 
    int seen[UCHAR_MAX + 1]; 
    char *c, *n; 

    memset(seen, 0, sizeof(seen)); 

    c = n = str; 
    while (*n != '\0') { 
     if (!isalpha(*n) || !seen[(unsigned char) *n]) { 
      *c = *n; 
      seen[(unsigned char) *n]++; 
      c++; 
     } 
     n++; 
    } 
    *c = '\0'; 
}

C程序从字符串中删除重复的字符

相关推荐