Parsing JSON and extracting information from a DIV tag

Problem description:

Here is my code:

<?php 
$ch = curl_init("http://gothere.sg/a/search?q=527201"); 
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE); 
$raw = curl_exec($ch); 
curl_close($ch); 

$data = json_decode($raw); 
echo htmlentities($data->where->html); 
?> 

Here's the output:

<div class=place><img class=marker src="/static/img/2/icon/panel/a.png?v=c2354"/><div class=locf><strong>201E Tampines Street 23</strong><br> Singapore 527201</div><p><a id="tooldt" href="">directions to</a> <a id="tooldf" href="">directions from</a> <a id="toolsn" href="">search nearby</a></p><div id="minibar"><p></p><form class=msf><input type=text><input type=submit value=""><input type=hidden value="527201"></form></div></div><div id=bah><div class=bar><h4>Some businesses around here:</h4></div><p><span style="background-color:##95cf29"></span><a class=bizlink href="" uid="5314c3d8-9775-4a4c-bbed-c28a04126993">United Employment Services</a>, #02-102</p><p><span style="background-color:##95cf29"></span><a class=bizlink href="" uid="05aa7169-4fad-4577-95b5-e79ef411c6f1">Cleverland Educational Services</a>, #04-106</p><p><span style="background-color:##95cf29"></span><a class=bizlink href="" uid="b323d00e-5e4a-45a0-a35f-f196e33c51f3">Tampines Women's Clinic</a>, #01-112</p><p><span style="background-color:##95cf29"></span><a class=bizlink href="" uid="cf5b1145-334d-472e-a965-a2f8ab31da4b">Ming Shing Pawnshop Pte Ltd</a>, #01-96</p><p><span style="background-color:##95cf29"></span><a class=bizlink href="" uid="7cbe2217-b763-4e1d-81de-7f0f8d1be0bb">Froggies</a>, #04-96</p><p><span style="background-color:##95cf29"></span><a class=bizlink href="" uid="43798461-d418-4ac1-a2b5-9f7359f538f5">Tampines St 23 (POSB)</a>, #01-100</p><p><span style="background-color:##95cf29"></span><a class=bizlink href="" uid="2703952e-bfc0-46cd-a981-479ae751b1e4">Arrow Communication</a>, #01-76</p><p><span style="background-color:##95cf29"></span><a class=bizlink href="" uid="8bad697d-42e0-48bb-8cd3-57601e42a39f">Efficient Tuition Centre</a>, #03-102</p><p style="display:none"><span style="background-color:##95cf29"></span><a class=bizlink href="" uid="3e6f08ed-3917-47fa-a434-19e2d90a7682">Guardian - Tampines St 23 Blk 201E</a>, #01-94</p><p style="display:none"><span 
style="background-color:##95cf29"></span><a class=bizlink href="" uid="3c49c4e8-dc25-49ab-8481-0a8369ba20d7">Yes Boss Food Centre</a></p><p style="display:none"><span style="background-color:##95cf29"></span><a class=bizlink href="" uid="a8d53b86-4b39-4d09-b8d7-67223228e3dd">Universal Medical Clinic</a>, #01-104</p><p style="display:none"><span style="background-color:##95cf29"></span><a class=bizlink href="" uid="4baed1a5-8b5f-4f53-b1b6-056a60ce2a4c">Tampines Pawnshop Pte Ltd</a>, #01-86</p><p style="display:none"><span style="background-color:##95cf29"></span><a class=bizlink href="" uid="9258c2ba-9e4f-48b0-aeab-d98c312e5328">Afghanistan Family Restaurant</a>, #01-56</p><p style="display:none"><span style="background-color:##95cf29"></span><a class=bizlink href="" uid="cb8cdfd6-1203-49e3-a64a-b1d7a64384cc">7 Eleven</a>, #01-100</p><p style="display:none"><span style="background-color:##95cf29"></span><a class=bizlink href="" uid="6c3c8595-b3dd-4fe0-9592-c053392a5036">Hairsolutions (Unisex)</a>, #01-118</p><p style="display:none"><span style="background-color:##95cf29"></span><a class=bizlink href="" uid="307b258e-ce57-46f0-924b-9ea1a01b49f0">Phase Hairdressing - North Bridge</a>, #01-64</p><p id=baha><a href="">+ show all</a></p></div><div id=aah><div class=bar><h4>Browse amenities around here: </h4></div><img class=marker src="/static/img/2/icon/panel/amenities.png?v=ce268"/><p><a value=4 href=>ATMs</a><a value=5 href=>Banks</a><a value=1 href=>Clinics</a><a value=6 href=>Petrol Kiosks</a><br><a value=2 href=>Post Offices</a><a value=3 href=>Schools</a><a value=0 href=>Supermarkets</a></p></div> 

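So how do I extract the data from <div class=locf><strong>201E Tampines Street 23</strong><br> Singapore 527201</div>? This is the only information I want. Also, is there any way to strip the <strong> and <br> tags once I have extracted it?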

Possible duplicate of http://*.com/questions/3577641/how-do-you-parse-and-process-html-xml-in-php – Thayne

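If your HTML is fairly stable, you may be able to get away with a regular expression, for example something like this (untested, and not very robust):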

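$match = array(); 
// PCRE patterns need delimiters (# here); without them PHP treats < and > as the delimiters.
// The s modifier lets . match newlines inside the div.
if (preg_match('#<div class=locf>(.*?)</div>#s', $data->where->html, $match)) { 
    $locf = $match[1]; // inner HTML of the div, tags included
} else { 
    $locf = ''; 
} 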

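Note that this particular regular expression will fail if there is a nested div inside the one you are matching, and it is also sensitive to whitespace and case. Making it more robust with respect to whitespace and case is relatively straightforward, but the nested-div problem is trickier and may require more than a simple regular expression.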
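If nested divs or attribute variations become a concern, one possible alternative is to let PHP's built-in DOM extension parse the fragment instead. The following is only a sketch: the `$html` string is a shortened stand-in for `$data->where->html`, and the choice to join the pieces with a comma is an assumption made for illustration.

```php
<?php
// Shortened stand-in for $data->where->html (assumption for illustration).
$html = '<div class=place><div class=locf><strong>201E Tampines Street 23</strong>'
      . '<br> Singapore 527201</div><p><a id="tooldt" href="">directions to</a></p></div>';

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // suppress warnings about sloppy markup
$doc->loadHTML($html);
libxml_clear_errors();

// Select the div whose class attribute is exactly "locf".
$xpath = new DOMXPath($doc);
$nodes = $xpath->query('//div[@class="locf"]');

$locf = '';
if ($nodes !== false && $nodes->length > 0) {
    $parts = array();
    // Collect the text of each child node, skipping empty ones such as <br>.
    foreach ($nodes->item(0)->childNodes as $child) {
        $text = trim($child->textContent);
        if ($text !== '') {
            $parts[] = $text;
        }
    }
    $locf = implode(', ', $parts);
}

echo $locf; // 201E Tampines Street 23, Singapore 527201
```

Because the parser builds a real tree, this approach returns text only, so the <strong> and <br> tags never need to be stripped afterwards.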

Once you have $locf, you can use preg_replace to remove any or all of the HTML tags inside it.
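As a rough sketch of that clean-up step, assuming $locf holds the inner HTML shown in the question (the value is hard-coded here so the snippet runs on its own), and assuming a comma is an acceptable replacement for the line break:

```php
<?php
// Hard-coded stand-in for the value captured from the div (assumption).
$locf = '<strong>201E Tampines Street 23</strong><br> Singapore 527201';

// Turn <br> into a separator first so the two address lines don't run together.
$text = preg_replace('#<br\s*/?>#i', ', ', $locf);
// Remove every remaining tag.
$text = preg_replace('#<[^>]+>#', '', $text);
// Collapse runs of whitespace and trim the ends.
$text = trim(preg_replace('/\s+/', ' ', $text));

echo $text; // 201E Tampines Street 23, Singapore 527201
```

PHP's built-in strip_tags() would also remove the tags in a single call, but it does not insert a separator where the <br> was.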

Warning: preg_match() expects parameter 2 to be string, object given. How do I solve this? – user3036527

Sorry, you should use $data->where->html as the second argument, i.e. the second argument should be your HTML string. – Thayne