Scraping a specific area of a website's content behind a secure login

Question:

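I want to scrape some specific text from a website, and that text sits behind a secure login. I am following the cURL tutorial at http://www.digeratimarketing.co.uk/2008/12/16/curl-page-scraping-script/ but I cannot get it to work for my case. Here is my cURL script: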

$url = "http://aftabcurrency.com/login_script.php";
$cookie = 'cookies.txt'; // file used to store and resend the session cookie
$timeout = 30;

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the page instead of printing it
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); // follow any redirects
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie);

// Submit the login form fields
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, "user_name=user&user_password=pass&passcode=code");

$result = curl_exec($ch);
curl_close($ch);

$source = $result;
if (preg_match("/(CC3300\">)(.*?)(<\/font>)/is", $source, $found)) {
    echo $found[2];
} else {
    echo "Text not found.";
}

For example, on aftabcurrency.com I only want to scrape the "Our Services Matter!" text (this text changes every day).

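You don't need to log in to scrape "Our Services Matter!". It is also shown to users who are not logged in, so you can save yourself the trouble! –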

I know, but that was only an example. I want to copy some text from inside the login-protected page. – user1447187

Answer

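What I would do is "cut out" the text between a start marker and an end marker: it begins right after the font color 613A75 in the page source and ends at the closing </font> tag. Here is a regex solution: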

$source = file_get_contents("http://aftabcurrency.com/index.php");
if (preg_match("/(613A75\">)(.*?)(<\/font>)/is", $source, $found)) {
    echo $found[2];
} else {
    echo "Text not found.";
}

If you want to do this with text inside the login-protected area, append my code to your own script and replace $source = file_get_contents(...) with $source = $result.
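As a rough sketch of that combination, the matching step can be wrapped in a small function and fed the string returned by curl_exec(). The helper name extract_font_text and the sample HTML below are hypothetical and only illustrate the idea; the real page markup may differ, in which case the pattern will need adjusting.

```php
<?php
// Hypothetical helper (not from the original answer): returns the text that
// follows `<color>">` up to the next closing </font> tag, or null if no match.
function extract_font_text(string $html, string $color): ?string
{
    // preg_quote() guards against regex metacharacters in $color.
    $pattern = '/' . preg_quote($color, '/') . '">(.*?)<\/font>/is';
    if (preg_match($pattern, $html, $found)) {
        return trim($found[1]);
    }
    return null;
}

// Stand-in for the $result returned by curl_exec() after logging in
// (assumed markup, for illustration only).
$result = '<td><font color="#CC3300">Members only text</font></td>';

$text = extract_font_text($result, 'CC3300');
echo $text !== null ? $text : 'Text not found.';
```

If this prints "Text not found." against the real page, it may help to dump $result first and check that the login actually succeeded and that the color code appears in the returned HTML.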

There are other ways to do this as well: DOMDocument with XPath, or the simple PHP string functions strpos/strstr/substr.
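Those two alternatives might look roughly like the following. This is a sketch under assumptions: the $source string is made-up sample markup rather than the real page, the variable names are illustrative, and the DOM extension is assumed to be enabled.

```php
<?php
// Sample markup standing in for the fetched page (hypothetical content).
$source = '<html><body><font color="#613A75">Our Services Matter!</font></body></html>';

// Option 1: DOMDocument + XPath.
libxml_use_internal_errors(true);   // keep warnings from sloppy HTML quiet
$dom = new DOMDocument();
$dom->loadHTML($source);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
// Select <font> elements whose color attribute contains the hex code
// (the comparison is case-sensitive).
$nodes = $xpath->query('//font[contains(@color, "613A75")]');
$viaXpath = $nodes->length > 0 ? trim($nodes->item(0)->textContent) : null;

// Option 2: plain string functions.
$startMarker = '613A75">';
$endMarker = '</font>';
$viaStrpos = null;
$start = strpos($source, $startMarker);
if ($start !== false) {
    $start += strlen($startMarker);             // skip past the start marker
    $end = strpos($source, $endMarker, $start); // first </font> after it
    if ($end !== false) {
        $viaStrpos = substr($source, $start, $end - $start);
    }
}

echo $viaXpath ?? 'Text not found.', "\n";
echo $viaStrpos ?? 'Text not found.', "\n";
```

The XPath version tends to tolerate changes in attribute order or whitespace better than a regex, while the strpos/substr version has no dependencies but relies on the exact marker strings being present.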

I did what you suggested, but I always get "Text not found." See my original question. – user1447187

I have edited the code above. The source (text) I want to extract from is here: http://tinypaste.com/45fa0bed – user1447187

That code only works for http://aftabcurrency.com/index.php and the "Our Services Matter!" text. Now you have given me new source code from the site. What do you want to grab from it? – MilMike
