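Counting the word "the" gives me a different answer every time

Question:

So, I have a simple script that reads a text file from the command line, and I want to count the number of occurrences of "the", but I have been getting weird numbers.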

while (<>) {
    $wordcount = split(/\bthe\b/, $_);
}
print "\"the\" occurs $wordcount times in $ARGV";

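So, using that, I get 10 occurrences, but if I use /\Bthe\B/i I get 12, and /\Bthe\B/ gives me 6, I believe. There are 11 occurrences in my test txt. Am I just an idiot? Should $wordcount start at 1 or 0? Is it also bad practice to use split this way? The code works for actually counting words, but not when counting an exact string. I'm new to Perl, so any and all abuse is appreciated. Thanks

Edit: I also know it isn't adding, but I now get that $wordcount is being treated more like an array, so it worked for a previous iteration, although it was surely bad form.

You are overwriting '$wordcount' for each line, so you only print the number of occurrences in the last line. If you want the total, you should use '+=' instead of '='. – Barmar 2014-10-03 02:55:51

They are all wrong. Both because you are not adding (like Barmar said), and because 'split' has no way of counting the number of things that matched the pattern (its result will usually, but not always, be too high). – hobbs 2014-10-03 03:07:12

Answer

split breaks a string into a list according to the supplied regular expression. Your count comes from the fact that you have put split in scalar context. From perldoc -f split: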

split Splits the string EXPR into a list of strings and returns the 
     list in list context, or the size of the list in scalar context.

Given the string "The quick brown fox jumps over the lazy dog", I would expect your $wordcount to be 2, which would be correct.

The quick brown fox jumps over the lazy dog 
^^^============================^^^========= -> two fields

However, if you have "A bird and the quick brown fox jumps over the lazy dog", you end up with 3, which is incorrect.

A bird and the quick brown fox jumps over the lazy dog 
===========^^^============================^^^========= -> three fields
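To see how the field count and the occurrence count drift apart, here is a small sketch. The helper names count_fields and count_matches are made up for illustration and are not part of the answer; the pattern is the case-sensitive one from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: number of fields split() produces for the pattern.
# Assigning to an array first keeps split() from applying an implicit limit.
sub count_fields {
    my ($text) = @_;
    my @fields = split /\bthe\b/, $text;
    return scalar @fields;
}

# Hypothetical helper: number of times the whole word "the" actually matches.
sub count_matches {
    my ($text) = @_;
    my $count = () = $text =~ /\bthe\b/g;
    return $count;
}

for my $text (
    "The quick brown fox jumps over the lazy dog",
    "A bird and the quick brown fox jumps over the lazy dog",
) {
    printf "%d fields, %d matches: %s\n",
        count_fields($text), count_matches($text), $text;
}
```

With this case-sensitive pattern, the first sentence yields two fields from a single match, and the second yields three fields from two matches: split reports the pieces between matches, not the matches themselves.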

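First, you definitely want \b, because it matches a word boundary. \B matches where there is no word boundary, so you would match words that contain "the" rather than the word "the" itself.

Second, you just want to count occurrences. You do that by counting the matches across the whole string: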

$wordcount = () = $string =~ /\bthe\b/gi;

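Assigning to $wordcount puts the list assignment in scalar context, which yields the number of matches. () is an empty list, used because you do not actually want to keep the matches. $string is the string being matched. You are matching "the" at word boundaries, and gi means across the whole string (global) and case-insensitive.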
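As a minimal sketch of the effect of the two flags, the following counts whole-word matches in a made-up sample sentence (the sentence and variable names are assumptions for illustration only):

```perl
use strict;
use warnings;

# Made-up sample text; "other" contains "the" but is not the word "the".
my $string = "The cat saw the other cat. THE END of the story.";

# A match with /g in list context returns every match; assigning that list
# to () and then to a scalar yields how many matches there were.
my $case_sensitive   = () = $string =~ /\bthe\b/g;
my $case_insensitive = () = $string =~ /\bthe\b/gi;

print "case-sensitive: $case_sensitive\n";      # prints 2
print "case-insensitive: $case_insensitive\n";  # prints 4
```

Without /i only the two lowercase occurrences are counted; with /i, "The" and "THE" are counted as well. In both cases \b keeps "other" out of the count.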

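Didn't know I could use () as an array. Thanks for the informative answer – 2014-10-03 03:26:18

@D_C See ['perlsecret'](http://search.cpan.org/dist/perlsecret/lib/perlsecret.pod#Goatse) for more information. – chrsblck 2014-10-03 05:48:09

@chrsblck Goatse made me laugh out loud, but it also explained all the whys and my process. Thanks! – 2014-10-03 07:17:34

Answer

With the /i flag, 'The' will be included, but not without it.

\B is a non-word boundary, so it will only find things like "clothes", not "the".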
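The difference between the two anchors can be checked with a short sketch. The helper names is_whole_word and is_inner, and the word list, are hypothetical and chosen only for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: true when the text contains "the" as a whole word.
sub is_whole_word {
    my ($text) = @_;
    return $text =~ /\bthe\b/ ? 1 : 0;
}

# Hypothetical helper: true when "the" occurs with word characters
# directly on both sides, i.e. strictly inside a longer word.
sub is_inner {
    my ($text) = @_;
    return $text =~ /\Bthe\B/ ? 1 : 0;
}

for my $word (qw(the other clothes then bathe)) {
    printf "%-8s \\b: %d  \\B: %d\n",
        $word, is_whole_word($word), is_inner($word);
}
```

Only "the" passes the \b test, and only "other" and "clothes" pass the \B test. "then" and "bathe" pass neither, because in each of them one end of "the" sits at the edge of the word.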

Yes, using split this way is bad practice. Properly, if you just want a count, do this:

$wordcount =() = split ...;

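split in scalar context does something that seemed like a good idea originally but no longer seems so good, so avoid it. The incantation above calls it in list context and assigns the number of elements found to $wordcount. Be aware, though, that when split is assigned to an empty list Perl applies an implicit limit of 1, so the string is not split at all and the count comes out as at most 1; assigning to an array and taking its size avoids this.

But the elements produced by splitting on "the" are not what you want; you want the number of times "the" is found. So do this (possibly with /ig instead of just /g):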

$wordcount =() = /\bthe\b/g;

Note that you probably want +=, not =, to get the total across all lines.

Answer

Use a regular expression in list context to pull out the count:

my $wordcount = 0; 

while (<>) { 
    $wordcount +=() = /\bthe\b/g; 
} 

print qq{"the" occurs $wordcount times in $ARGV\n};

Reference: perlfaq4 - How can I count the number of occurrences of a substring within a string?

Answer

sample.txt

Ajith 
kumar 
Ajith 
my name is Ajith and Ajith 
lastname is kumar

Code

use strict;
use warnings;
use Data::Dumper;

print "Enter your string = ";
my $input = <STDIN>;   ## User input
chomp $input;          ## Remove the trailing newline from the input

my %count;
open my $fh, '<', 'sample.txt' or die "Cannot open sample.txt: $!";   ## Read the data from a file

while (my $line = <$fh>) {
    ## Split the line on whitespace, since a line may hold more than one word
    for my $word (split ' ', $line) {
        $count{$word}++;   ## Counter
    }
}
close $fh;

## Print 0 instead of an undefined value when the word never occurs
print Dumper "Result: " . ($count{$input} // 0);

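The code above takes input via the command prompt, searches the given text file "sample.txt" for that word, and then prints how many times it occurs in the file (sample.txt).

Note: the user input is case-sensitive.

Input from the user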

Enter your string = Ajith

Output

$VAR1 = 'Result: 4';

Answer

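use strict;
use warnings;

print "Enter the string: ";
chomp(my $string = <STDIN>);

# "filename.txt" is the placeholder file name used in this answer
open(my $fh, '<', 'filename.txt') or die "Error opening file: $!";

my @words;
while (my $line = <$fh>) {
    push @words, split ' ', $line;   # split each line on whitespace
}
close $fh;

# Count the words that contain the string, ignoring case;
# \Q...\E keeps regex metacharacters in the input literal
my $count = grep { /\Q$string\E/i } @words;
print "Total no. of $string is: $count\n";

This gives you the expected output.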
