PHP SimpleXMLElement: parsing XML tags with multiple 'potential' attributes

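Problem description:

As the title implies, I have a question about parsing XML tags that may have multiple attributes (or none at all), and I'm looking for advice on how to achieve this; but first, I think a little background is in order.

I'm working on a PHP-based AIML interpreter script called Program O, and I'm in the process of migrating the code from string replacement functions (e.g. str_replace, preg_replace, etc.) to PHP's built-in SimpleXML functions. So far, pretty much all of the parsing functions I've created for the various AIML tags are complete and working quite nicely, but one tag in particular is kicking my seat warmer, and that's the CONDITION tag.

According to the AIML tag reference, the tag has three distinct "forms": one with both a NAME and a (VALUE | CONTAINS | EXISTS) attribute, called a "multi condition"; one with only a NAME attribute, called a "single name list condition"; and a final "form" called a "list condition", which is just the CONDITION tag with no attributes at all. The AIML tag reference I linked to earlier has examples of all three forms, but with a lot of words in between, so I'll repeat them here, along with the surrounding AIML code:

FORM: multi condition tag: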

<category> 
    <pattern>I AM BLOND</pattern> 
    <template>You sound very 
    <condition name="gender" value="female"> attractive.</condition> 
    <condition name="gender" value="male"> handsome.</condition> 
    </template> 
</category> 

FORM: list condition tag:

<category> 
    <pattern>I AM BLOND</pattern> 
    <template>You sound very 
    <condition> 
     <li name="gender" value="female"> attractive.</li> 
     <li name="gender" value="male"> handsome.</li> 
    </condition> 
    </template> 
</category> 

FORM: single name list condition tag:

<category> 
    <pattern>I AM BLOND</pattern> 
    <template>You sound very 
    <condition name="gender"> 
     <li value="female"> attractive.</li> 
     <li value="male"> handsome.</li> 
    </condition> 
    </template> 
</category> 
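The three forms differ only in which attributes the CONDITION element carries, so they can be told apart before any real parsing happens. Here is a minimal sketch of that check with SimpleXML; `condition_form()` is a hypothetical helper made up for illustration and is not part of Program O:

```php
<?php
// Hypothetical helper (not part of Program O): classify a <condition>
// element by the attributes it carries.
function condition_form(SimpleXMLElement $condition)
{
    $hasName = isset($condition['name']);
    // Any one of VALUE, CONTAINS or EXISTS counts as the "test" attribute.
    $hasTest = isset($condition['value'])
        || isset($condition['contains'])
        || isset($condition['exists']);

    if ($hasName && $hasTest) {
        return 'multi condition';
    }
    if ($hasName) {
        return 'single name list condition';
    }
    return 'list condition';
}

// One <condition> of each form, taken from the examples above.
$xml = simplexml_load_string(
    '<template>'
    . '<condition name="gender" value="female"> attractive.</condition>'
    . '<condition name="gender"><li value="male"> handsome.</li></condition>'
    . '<condition><li name="gender" value="male"> handsome.</li></condition>'
    . '</template>'
);

foreach ($xml->condition as $condition) {
    echo condition_form($condition), PHP_EOL;
}
```

Using `isset()` on the attribute avoids casting the whole attribute list to an array just to count it.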

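In previous versions of the script I'm working on, only the "list condition" form of the CONDITION tag was used. While that is the most commonly used form, it isn't the only one in use, so I need to be able to accommodate the other two forms. So my question is:

How can this be accomplished in an efficient manner?

I already have working code to parse the list condition form of the CONDITION tag, and preliminary tests look promising, in that it throws no errors and seems to produce the desired response (but only for the list condition form; the other two forms fail with errors, for obvious reasons). The function is listed below: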

function parse_condition_tag($convoArr, $element, $parentName, $level) 
{ 
    runDebug(__FILE__, __FUNCTION__, __LINE__, 'Starting function and setting timestamp.', 2); 
    $response = array(); 
    $attrName = $element['name']; 
    if (!empty ($attrName)) 
    { 
    $attrName = ($attrName == '*') ? $convoArr['star'][1] : $attrName; 
    $search = $convoArr['client_properties'][$attrName]; 
    $path = ($search != 'undefined') ? "//li[@value=\"$search\"]" : '//li[not(@*)]'; // fall back to the <li> without attributes
    $choice = $element->xpath($path); 
    $children = $choice[0]->children(); 
    if (!empty ($children)) 
    { 
     $response = parseTemplateRecursive($convoArr, $children, $level + 1); 
    } 
    else 
    { 
     $response[] = (string) $choice[0]; 
    } 
    $response_string = implode_recursive(' ', $response, __FILE__, __FUNCTION__, __LINE__); 
    runDebug(__FILE__, __FUNCTION__, __LINE__, "Returning '$response_string' and exiting function.", 4); 
    return $response_string; 
    } 
    trigger_error('Parsing of the CONDITION tag failed! XML = ' . $element->asXML()); 
} 

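I'm still relatively new to using the SimpleXML functions, so I may well be missing something obvious. In fact, I hope that's the case. :)

EDIT: As promised in one of my comments, here is the function I finally ended up with: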

/* 
    * function parse_condition_tag 
    * Acts as a de-facto if/else structure, selecting a specific output, based on certain criteria 
    * @param [array] $convoArr - The conversation array (a container for a number of necessary variables) 
    * @param [object] $element - The current XML element being parsed 
    * @param [string] $parentName - The parent tag (if applicable) 
    * @param [int] $level   - The current recursion level 
    * @return [string] $response_string 
    */ 

function parse_condition_tag($convoArr, $element, $parentName, $level) 
{ 
    runDebug(__FILE__, __FUNCTION__, __LINE__, 'Starting function and setting timestamp.', 2); 
    global $error_response; 
    $response = array(); 
    $attrName = $element['name']; 
    $attributes = (array)$element->attributes(); 
    $attributesArray = (isset($attributes['@attributes'])) ? $attributes['@attributes'] : array(); 
    runDebug(__FILE__, __FUNCTION__, __LINE__, 'Element attributes:' . print_r($attributesArray, true), 1); 
    $attribute_count = count($attributesArray); 
    runDebug(__FILE__, __FUNCTION__, __LINE__, "Element attribute count = $attribute_count", 1); 
    if ($attribute_count == 0) // Bare condition tag 
    { 
    runDebug(__FILE__, __FUNCTION__, __LINE__, 'Parsing a CONDITION tag with no attributes. XML = ' . $element->asXML(), 2); 
    $liNamePath = 'li[@name]'; 
    $condition_xPath = ''; 
    $exclude = array(); 
    $choices = $element->xpath($liNamePath); 
    foreach ($choices as $choice) 
    { 
     $choice_name = (string)$choice['name']; 
     if (in_array($choice_name, $exclude)) continue; 
     $exclude[] = $choice_name; 
     runDebug(__FILE__, __FUNCTION__, __LINE__, 'Client properties = ' . print_r($convoArr['client_properties'], true), 2); 
     $choice_value = get_client_property($convoArr, $choice_name); 
     $condition_xPath .= "li[@name=\"$choice_name\"][@value=\"$choice_value\"]|"; 
    } 
    $condition_xPath .= 'li[not(@*)]'; 
    runDebug(__FILE__, __FUNCTION__, __LINE__, "xpath search = $condition_xPath", 4); 
    $pick_search = $element->xpath($condition_xPath); 
    runDebug(__FILE__, __FUNCTION__, __LINE__, 'Pick array = ' . print_r($pick_search, true), 2); 
    $pick_count = count($pick_search); 
    runDebug(__FILE__, __FUNCTION__, __LINE__, "Pick count = $pick_count.", 2); 
    $pick = $pick_search[0]; 
    } 
    elseif (array_key_exists('value', $attributesArray) or array_key_exists('contains', $attributesArray) or array_key_exists('exists', $attributesArray)) // condition tag with either VALUE, CONTAINS or EXISTS attributes 
    { 
    runDebug(__FILE__, __FUNCTION__, __LINE__, 'Parsing a CONDITION tag with 2 attributes.', 2); 
    $condition_name = (string)$element['name']; 
    $test_value = get_client_property($convoArr, $condition_name); 
    switch (true) 
    { 
     case (isset($element['value'])): 
     $condition_value = (string)$element['value']; 
     break; 
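     case (isset($element['contains'])): // note: still compared for equality below, same as VALUE 
     $condition_value = (string)$element['contains']; 
     break; 
     case (isset($element['exists'])): // note: still compared for equality below, same as VALUE 
     $condition_value = (string)$element['exists']; 
     break; 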
     default: 
     runDebug(__FILE__, __FUNCTION__, __LINE__, 'Something went wrong with parsing the CONDITION tag. Returning the error response.', 1); 
     return $error_response; 
    } 
    $pick = ($condition_value == $test_value) ? $element : ''; 
    } 
    elseif (array_key_exists('name', $attributesArray)) // this ~SHOULD~ just trigger if the NAME value is present, and ~NOT~ NAME and (VALUE|CONTAINS|EXISTS) 
    { 
    runDebug(__FILE__, __FUNCTION__, __LINE__, 'Parsing a CONDITION tag with only the NAME attribute.', 2); 
    $condition_name = (string)$element['name']; 
    $test_value = get_client_property($convoArr, $condition_name); 
    $path = "li[@value=\"$test_value\"]|li[not(@*)]"; 
    runDebug(__FILE__, __FUNCTION__, __LINE__, "search string = $path", 4); 
    $choice = $element->xpath($path); 
    $pick = $choice[0]; 
    runDebug(__FILE__, __FUNCTION__, __LINE__, 'Found a match. Pick = ' . print_r($choice, true), 4); 
    } 
    else // nothing matches 
    { 
    runDebug(__FILE__, __FUNCTION__, __LINE__, 'No matches found. Returning default error response.', 1); 
    return $error_response; 
    } 
    $children = (is_object($pick)) ? $pick->children() : null; 
    if (!empty ($children)) 
    { 
    $response = parseTemplateRecursive($convoArr, $children, $level + 1); 
    } 
    else 
    { 
    $response[] = (string) $pick; 
    } 
    $response_string = implode_recursive(' ', $response); 
    return $response_string; 
} 

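I suspect there may be a better, more elegant way of doing this (story of my life, really), but the above works as intended. Any suggestions for improvement will be appreciated and given serious consideration.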
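For the NAME-only branch, the whole selection comes down to a single XPath union evaluated relative to the CONDITION element. The sketch below isolates just that step. `pick_li()` is a hypothetical helper, and the attribute-less `<li>` is an assumption added here to show the fallback; it does not appear in the examples above:

```php
<?php
$condition = simplexml_load_string(
    '<condition name="gender">'
    . '<li value="female"> attractive.</li>'
    . '<li value="male"> handsome.</li>'
    . '<li> interesting.</li>' // assumed default item, added for illustration
    . '</condition>'
);

// Hypothetical helper: return the <li> whose VALUE matches, otherwise the
// attribute-less <li>, otherwise null.
function pick_li(SimpleXMLElement $condition, $value)
{
    // NOTE: $value is interpolated as-is; a value containing a double quote
    // would break the expression and would need escaping in real code.
    $matches = $condition->xpath("li[@value=\"$value\"]|li[not(@*)]");

    // empty() covers both an empty result array and a failed query.
    return empty($matches) ? null : $matches[0];
}

echo trim((string) pick_li($condition, 'male')), PHP_EOL;    // handsome.
echo trim((string) pick_li($condition, 'unknown')), PHP_EOL; // interesting.
```

Returning null when nothing matches lets the caller decide what the "no match" response should be, instead of indexing into an empty array.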

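Please note that I'm not looking for someone to "do my homework for me", but rather for a (not so) gentle nudge in the right direction. That said, code examples are still welcome, but not required. :) – 2013-02-22 01:43:08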

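Note that I'm not using SimpleXML, because IMHO DOMDocument is just way better and waaay more powerful. Both DOMDocument and DOMXPath have been available since PHP 5.

I created a simple parser class that parses the supplied document to get the conditions in their different styles: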

class AIMLParser 
{ 
    public function parse($data) 
    { 
     $internalErrors = libxml_use_internal_errors(true); 

     $dom = new DOMDocument(); 
     $dom->loadHTML($data); 
     $xpath = new DOMXPath($dom); 

     $templates = array(); 

     foreach($xpath->query('//template') as $templateNode) { 
      $template = array(
       'text' => $templateNode->firstChild->nodeValue, // note: this expects the first child node to always be the text node 
       'conditions' => array(), 
      ); 

      foreach ($templateNode->getElementsByTagName('condition') as $condition) { 
       if ($condition->hasAttribute('name') && $condition->hasAttribute('value')) { 
        $template['conditions'] = $this->parseConditionsWithoutChildren($template['conditions'], $condition); 
       } elseif ($condition->hasAttribute('name')) { 
        $template['conditions'] = $this->parseConditionsWithNameAttribute($template['conditions'], $condition); 
       } else { 
        $template['conditions'] = $this->parseConditionsWithoutAttributes($template['conditions'], $condition); 
       } 
      } 

      $templates[] = $template; 
     } 

     libxml_use_internal_errors($internalErrors); 

     return $templates; 
    } 

    private function parseConditionsWithoutChildren(array $conditions, DOMNode $condition) 
    { 
     if (!array_key_exists($condition->getAttribute('name'), $conditions)) { 
      $conditions[$condition->getAttribute('name')] = array(); 
     } 

     $conditions[$condition->getAttribute('name')][$condition->getAttribute('value')] = $condition->nodeValue; 

     return $conditions; 
    } 

    private function parseConditionsWithNameAttribute(array $conditions, DOMNode $condition) 
    { 
     if (!array_key_exists($condition->getAttribute('name'), $conditions)) { 
      $conditions[$condition->getAttribute('name')] = array(); 
     } 

     foreach ($condition->getElementsByTagName('li') as $listItem) { 
      $conditions[$condition->getAttribute('name')][$listItem->getAttribute('value')] = $listItem->nodeValue; 
     } 

     return $conditions; 
    } 

    private function parseConditionsWithoutAttributes(array $conditions, DOMNode $condition) 
    { 
     foreach ($condition->getElementsByTagName('li') as $listItem) { 
      if (!array_key_exists($listItem->getAttribute('name'), $conditions)) { 
       $conditions[$listItem->getAttribute('name')] = array(); 
      } 

      $conditions[$listItem->getAttribute('name')][$listItem->getAttribute('value')] = $listItem->nodeValue; 
     } 

     return $conditions; 
    } 
} 

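What it does is search the document for template nodes and loop through them. For each template node it works out which style of condition is used, and based on that it picks the matching parse function. After looping through all the templates, it returns a parsed array that contains all the information you need (I think).

To parse a document you can do: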

$parser = new AIMLParser(); 
$templates = $parser->parse($someVariableWithTheContentOfTheDocument); 
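Each entry in the returned array holds the template's leading text plus a map of predicate name to value to reply text. The sketch below shows one way such an entry could be turned into a response. The array is written out by hand as an assumed shape (whitespace trimmed for readability), and `resolve_template()` is a hypothetical helper, not part of the class above:

```php
<?php
// Hand-written stand-in for one entry of the parser's return value.
// This shape is an assumption based on reading the class above.
$template = array(
    'text' => 'You sound very',
    'conditions' => array(
        'gender' => array(
            'female' => 'attractive.',
            'male'   => 'handsome.',
        ),
    ),
);

// Hypothetical helper: append the reply of every condition whose predicate
// value matches the client's properties.
function resolve_template(array $template, array $clientProperties)
{
    $parts = array($template['text']);

    foreach ($template['conditions'] as $name => $repliesByValue) {
        if (!isset($clientProperties[$name])) {
            continue; // this predicate is unknown for the client
        }
        $value = $clientProperties[$name];
        if (isset($repliesByValue[$value])) {
            $parts[] = $repliesByValue[$value];
        }
    }

    return implode(' ', $parts);
}

echo resolve_template($template, array('gender' => 'male')), PHP_EOL;
// You sound very handsome.
```

Because all three condition styles end up in the same name/value map, the lookup does not need to know which form the original tag used.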

Demo: http://codepad.viper-7.com/JPuBaE

ORLY? Downvote? – PeeHaa 2013-02-23 16:25:15

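While this function doesn't fit in with the rest of the script, it does give me a direction to investigate, so +1. If it does indeed lead me to what I need, I'll be sure to mark it as the "answer". Thanks. :) – 2013-02-23 16:27:09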

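While I didn't end up using any of the example code posted above, it was enough of a "push in the right direction" to qualify as the answer. I'm still fleshing out the code, but when I'm happy with it, I'll post it here so that others can benefit. Thanks again, @PeeHaa. – 2013-02-24 06:24:11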