Description
This is:
- a bug report
- a feature request
- not a usage question (ask them on https://stackoverflow.com/questions/tagged/phpword)
Expected Behavior
Please describe the behavior you are expecting.
PHPWord should find the correct language for <span lang="DE"> and process the addHtml() call regardless of which language code is in the lang attribute.
Current Behavior
What is the current behavior?
No DOCX is generated. PHPWord crashes if a language code is not defined.
Failure Information
Please help provide information about the failure.
PHP Fatal error: Uncaught InvalidArgumentException: DE is not a valid language code in /vendor/phpoffice/phpword/src/PhpWord/Style/Language.php:226
The source is copied and pasted from an Outlook email into a CKEditor, stored in MySQL, and then passed to ::addHTML(). The offending markup looks like this:
<p class="MsoNormal"><strong><span lang="DE" style="color:windowtext; mso-fareast-language:DE-AT">
How to Reproduce
Please provide a code sample that reproduces the issue.
Use the HTML above with the following code.
<?php
require __DIR__ . '/vendor/autoload.php';

$phpWord = new \PhpOffice\PhpWord\PhpWord();
$section = $phpWord->addSection();
// The snippet as posted used $table without creating it; a table is added here so the loop can run.
$table = $section->addTable();

// $arr_follow (the follow-up records) and $TableCellStyle are defined elsewhere in the application.
foreach ($arr_follow as $key_ticket => $followups) {
    if (strlen($followups['contents']) > 0) {
        // Clean the stored HTML before handing it to addHtml()
        $content = Tom_excerp_voll($followups['contents']);
        $table->addRow();
        $table->addCell()->addText($followups['date'], $TableCellStyle);
        $table->addCell()->addText($followups['dauer'], $TableCellStyle);
        $table->addCell()->addText($followups['author'], $TableCellStyle);
        $zelle = $table->addCell(6600, $TableCellStyle);
        \PhpOffice\PhpWord\Shared\Html::addHtml($zelle, $content, false, false, null, ["font" => array("size" => 6)]);
    }
}
function Tom_excerp_voll($text) {
    // These are all the tags that have to be removed for addHtml() to work properly :(
    $pattern = array('/<div.*?>/','/<\/div>/','/<p.*?>/','/<\/p>/','/<a.*?>/','/<\/a>/','/<img.*?>/','/<strong>/','/<\/strong>/');
    // $replace = array('','','','<br/>','','',''); // for testing
    // $replace = array('','','','','','','','<div><b>','</b></div>'); // for testing - no success :(
    $replace = array('','','','','','','','','');
    $text = preg_replace($pattern, $replace, $text);
    // To avoid a double line break on <br />
    $text = str_replace("<br />", "</span><span>", '<span>'.$text.'</span>');
    // This workaround changes the lang attribute to a "valid" language code !!!!!
    $text = str_replace("lang=\"DE\"", "lang='de-DE'", $text);
    // To remove mail addresses like "Max Mustermann <[email protected]>":
    // this kind of address is recognized as a tag and has to be removed before DOM->loadXML()
    $text = preg_replace("/<(\/)?([a-zA-Z0-9._-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4})>/", "", $text);
    // Remove all whitespace between tags - nested tables sometimes contain a massive amount of spaces
    $text = preg_replace('/(\>)\s*(\<)/m', '$1$2', $text);
    // Remove all tabs - nested tables sometimes contain a massive amount of tabs
    $text = trim(preg_replace('/\t+/', '', $text));
    return $text;
}
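The str_replace() workaround above only handles the literal lang="DE". A slightly more general variant could be sketched as follows. The helper name normalizeLangAttributes() is hypothetical and not part of PHPWord, and expanding a two-letter code XX to xx-XX is an assumption that fits "DE" but is not a real locale for every language ("en" would become "en-EN"), so a lookup table may be the better choice in practice.

```php
<?php
// Hypothetical helper (not part of PHPWord): rewrites every two-letter
// lang attribute value, e.g. lang="DE", to the hyphenated form lang="de-DE".
function normalizeLangAttributes(string $html): string
{
    $result = preg_replace_callback(
        // Matches lang="XX" or lang='XX' with exactly two letters inside the quotes
        '/\blang=(["\'])([a-zA-Z]{2})\1/',
        static function (array $m): string {
            // Assumption: the region is taken to be the same as the language code
            $locale = strtolower($m[2]) . '-' . strtoupper($m[2]);
            return 'lang=' . $m[1] . $locale . $m[1];
        },
        $html
    );

    // preg_replace_callback() returns null on a PCRE error; keep the input in that case
    return $result ?? $html;
}

echo normalizeLangAttributes('<span lang="DE" style="color:windowtext">Text</span>'), PHP_EOL;
```

Values that already contain a region, such as lang="de-AT", are left untouched because the pattern only matches exactly two letters between the quotes.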
Context
The function "Tom_excerp_voll" is included only to document which workarounds are currently needed to get a "valid" document (with the restriction that I lose most of the styling: bold, italic, ...).
I think the main problem is the diversity of HTML sources. In my case it is an email that is copied and pasted from Outlook. In this case (Outlook client) the "real" language code is provided by "mso-fareast-language" in the style attribute and not by lang=, which holds only a short language code.
But PHPWord only checks lang= against a few language codes ('de-AT' is not among them either).
I am sure there are more language codes than the 13 that are currently in phpword/src/PhpWord/Style/Language.php :)
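To illustrate where the full code sits in Outlook markup, here is a minimal sketch that reads it from the style attribute. The function name extractMsoLanguage() is hypothetical, and the sketch assumes the declaration always has the two-part form seen in the fragment above (DE-AT).

```php
<?php
// Hypothetical helper (not part of PHPWord): returns the locale found in an
// inline "mso-fareast-language" declaration, or null if there is none.
function extractMsoLanguage(string $tag): ?string
{
    // Assumption: the value looks like "DE-AT" (two letters, hyphen, two letters)
    if (preg_match('/mso-fareast-language\s*:\s*([a-zA-Z]{2})-([a-zA-Z]{2})/', $tag, $m) === 1) {
        // Lower-case language, upper-case region, e.g. "de-AT"
        return strtolower($m[1]) . '-' . strtoupper($m[2]);
    }

    return null;
}

$tag = '<span lang="DE" style="color:windowtext; mso-fareast-language:DE-AT">';
var_dump(extractMsoLanguage($tag));
```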
Now the Feature-Request :)
In my opinion the best solution would be a fallback language code that is declared in the settings (and customizable by the programmer) to avoid the crashes (or something similar).
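The requested behavior could look roughly like the sketch below. It is plain PHP and does not use or describe any existing PHPWord setting: resolveLocale() and its $fallback parameter are hypothetical names, and 'en-US' is only an assumed default.

```php
<?php
// Hypothetical sketch of the requested fallback (not an existing PHPWord API):
// returns a normalized locale, or the fallback when the value is unusable.
function resolveLocale(?string $lang, string $fallback = 'en-US'): string
{
    // Accept only values of the form "xx-YY" (case-insensitive) and normalize their case
    if ($lang !== null && preg_match('/^([a-zA-Z]{2})-([a-zA-Z]{2})$/', $lang, $m) === 1) {
        return strtolower($m[1]) . '-' . strtoupper($m[2]);
    }

    // Anything else (null, "DE", garbage) falls back instead of throwing
    return $fallback;
}

echo resolveLocale('DE-AT', 'de-DE'), PHP_EOL; // well-formed value is kept
echo resolveLocale('DE', 'de-DE'), PHP_EOL;    // short code falls back
```

With such a fallback, an unknown lang value would degrade to the configured default instead of aborting the whole document with an InvalidArgumentException.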
- PHP version: 7.2.6
- PHPWord version: dev-master (+some hints from other tickets)