UVa 342 HTML Syntax Checking
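Problem Description

Write a program that validates the syntax of simple text written in HTML. We ignore the semantics of these documents and consider only a simplified set of syntax rules. The documents consist of ordinary text lines of arbitrary length, interspersed with markup tags. The tags we consider always occur in pairs. Tags are easy to recognize because they always appear in angle brackets, that is, between `<` and `>`. A tag name always consists of at most 10 uppercase letters. The end of the text region affected by a tag is indicated by a tag of the same name preceded by a forward slash `/`, such as `</TAG>`. A tagged region may span multiple lines of text. HTML tags must be properly nested, just like BEGIN...END pairs in Pascal or `{` and `}` in C/C++.

Input Format

The input contains multiple test cases. Each test case begins with a line containing an integer NL, the number of text lines in the test case. NL will not exceed 32767. The end of the input is marked by NL = 0. The line specifying NL is followed by NL lines of text to be checked for syntax compliance. Note that there is no maximum line length.

Output Format

For each test case, output the test case number (numbering starts at 1). If the text conforms to the rules above, output OK. If the text contains an error, output only the message for the first error, which is one of the following:

```
line #: bad character in tag name
line #: too many/few characters in tag name
line #: expected </xxxxxxxxxx>
line #: no matching begin tag
```

Here # is replaced by the number of the line on which the offending tag was detected. As the sample output shows, every message except the `expected` one ends with a period. Once an error has been detected and the appropriate message printed, the program must skip any remaining lines of the erroneous test case to reach the start of the next test case.

Sample Input

```
6
This is some ordinary text.
<BEGIN> This is included in the BEGIN tag </BEGIN>
<START>
Here's some stuff
and so is this
more stuff.</START>
2
This has a null tag <>
And an extra line after the error
5
This has some good stuff <OKAY> and some bad stuff later on.
<GOOD> All is still okay, but later on we'll have an error. </GOOD>
We're still in the pink! <THISISTOOLONG>
This line will be skipped.
As will this one.
1
This is an interesting error: <ERROR
2
This one is okay
<IN></IN>
1
Mismatch <START></STOP>
1
Missing start symbol: <OK></OK></NOTOK> more garbage...
1
<ELEVENChars>Both too long and invalid letter</ELEVENChars>
1
<ELEVENCHARS>Both too long and invalid letter</ELEVENCHARS>
1
<ELEVENCHARS!>Both too long and invalid letter</ELEVENCHARS!>
0
```

Sample Output

```
Test Case 1
OK
Test Case 2
line 1: too many/few characters in tag name.
Test Case 3
line 3: too many/few characters in tag name.
Test Case 4
line 1: bad character in tag name.
Test Case 5
OK
Test Case 6
line 1: expected </START>
Test Case 7
line 1: no matching begin tag.
Test Case 8
line 1: bad character in tag name.
Test Case 9
line 1: bad character in tag name.
Test Case 10
line 1: too many/few characters in tag name.
```

Problem Analysis

Nature of the problem

This is an HTML/XML tag syntax checking problem. Both the format of each tag and the correctness of the nesting have to be verified.

Syntax rules

- A tag starts with `<` and ends with `>`.
- A closing tag starts with `</`.
- A tag name consists of 1 to 10 uppercase letters.
- Tags must be properly nested: a tag opened later must be closed first (last in, first out), which is why a stack is used.
- Ordinary text may contain any characters, including spaces and line breaks.

Error types

| Error type | Trigger condition |
| --- | --- |
| `bad character in tag name` | The tag name contains a character that is not an uppercase letter (digits, lowercase letters, symbols and so on) |
| `too many/few characters in tag name` | The tag name is empty or longer than 10 characters |
| `expected </xxxx>` | A closing tag is encountered but its name does not match the opening tag on top of the stack |
| `no matching begin tag` | A closing tag is encountered but the stack is empty, so there is no corresponding opening tag |
| Unclosed tag at end of the test case | The stack is not empty after all lines have been processed; the reference code reports this as `expected </xxxx>` on the last line |

Key points

- Report only the first error.
- After an error is detected, skip the remaining lines of the current test case.
- Tag names contain only the uppercase letters A to Z.
- Tag name length is 1 to 10.

Reference Code

```cpp
// HTML Syntax Checking
// UVa ID: 342
// Verdict: Accepted
// Submission Date: 2016-07-08
// UVa Run Time: 0.070s
//
// Copyright (C) 2016, 邱秋. metaphysis # yeah dot net

#include <bits/stdc++.h>

using namespace std;

int main(int argc, char *argv[])
{
    ios::sync_with_stdio(false);

    int n, cases = 0;
    while (cin >> n, n) {
        // Skip the rest of the count line before reading character by character.
        cin.ignore(1024, '\n');
        // Do not skip whitespace, so that spaces and newlines can be read.
        cin.unsetf(ios::skipws);

        cout << "Test Case " << ++cases << endl;

        int lines = 1;       // current line number
        char letter;
        bool error = false;  // whether an error has already been reported
        stack<string> tags;  // tag stack used to verify nesting

        // Read character by character.
        while (cin >> letter) {
            // Newline handling.
            if (letter == '\n') {
                lines++;
                if (lines > n) break;  // all lines of this test case have been read
            }

            // An error has already occurred: skip the remaining input.
            if (error) continue;

            // Start of a tag.
            if (letter == '<') {
                string tag;
                bool closing = false;

                // Parse the tag contents.
                while (cin >> letter) {
                    if (letter == '\n') {
                        // Tag not closed: the newline comes before '>'.
                        cout << "line " << lines << ": bad character in tag name." << endl;
                        error = true;
                        lines++;
                        break;
                    } else if (letter == '/') {
                        if (closing == false) {
                            closing = true;
                            continue;
                        } else {
                            // A second '/' inside a closing tag.
                            cout << "line " << lines << ": bad character in tag name." << endl;
                            error = true;
                            break;
                        }
                    } else if (letter == '>') {
                        break;  // end of tag
                    } else if (letter < 'A' || letter > 'Z') {
                        // Not an uppercase letter.
                        cout << "line " << lines << ": bad character in tag name." << endl;
                        error = true;
                        break;
                    }

                    tag += letter;

                    // Tag name too long.
                    if (tag.length() > 10) {
                        cout << "line " << lines << ": too many/few characters in tag name." << endl;
                        error = true;
                        break;
                    }
                }

                // If there was no error, verify the nesting.
                if (error == false) {
                    if (tag.length() == 0) {
                        // Empty tag name.
                        cout << "line " << lines << ": too many/few characters in tag name." << endl;
                        error = true;
                    } else if (closing) {
                        // Closing tag.
                        if (tags.empty()) {
                            cout << "line " << lines << ": no matching begin tag." << endl;
                            error = true;
                        } else if (tags.top() != tag) {
                            cout << "line " << lines << ": expected </" << tags.top() << ">" << endl;
                            error = true;
                        } else {
                            tags.pop();
                        }
                    } else {
                        // Opening tag.
                        tags.push(tag);
                    }
                }

                if (lines > n) break;
            }
        }

        // End of the test case: check for tags that were never closed.
        if (error == false && tags.empty() == false) {
            cout << "line " << (lines - 1) << ": expected </" << tags.top() << ">" << endl;
            error = true;
        }

        if (!error) cout << "OK" << endl;

        // Restore whitespace skipping for reading the next line count.
        cin.setf(ios::skipws);
    }

    return 0;
}
```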
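
To make the error rules easier to test in isolation, the same checks can be sketched as a pure function that takes the lines of one test case and returns the verdict as a string. This is only an illustrative variant, not the submitted solution: `checkDocument` is a hypothetical helper name, and it assumes that a `/` is legal only immediately after `<`, whereas the reference code accepts the first `/` anywhere inside the tag.

```cpp
#include <stack>
#include <string>
#include <vector>

// Hypothetical helper (not part of the original submission): validates one
// test case and returns "OK" or the message for the first error found.
std::string checkDocument(const std::vector<std::string>& lines) {
    std::stack<std::string> open;  // names of the tags that are currently open

    // Builds "line N: message" for a 1-based line number.
    auto at = [](std::size_t ln, const std::string& msg) {
        return "line " + std::to_string(ln) + ": " + msg;
    };

    for (std::size_t i = 0; i < lines.size(); ++i) {
        const std::string& s = lines[i];
        const std::size_t ln = i + 1;
        std::size_t p = 0;

        while ((p = s.find('<', p)) != std::string::npos) {
            ++p;  // step past '<'

            // Assumption: '/' may only appear directly after '<'.
            bool closing = false;
            if (p < s.size() && s[p] == '/') {
                closing = true;
                ++p;
            }

            std::string name;
            bool terminated = false;
            for (; p < s.size(); ++p) {
                const char c = s[p];
                if (c == '>') {
                    terminated = true;
                    ++p;
                    break;
                }
                if (c < 'A' || c > 'Z')
                    return at(ln, "bad character in tag name.");
                name += c;
                if (name.size() > 10)
                    return at(ln, "too many/few characters in tag name.");
            }

            // The line ended before '>' was seen.
            if (!terminated)
                return at(ln, "bad character in tag name.");
            if (name.empty())
                return at(ln, "too many/few characters in tag name.");

            if (closing) {
                if (open.empty())
                    return at(ln, "no matching begin tag.");
                if (open.top() != name)
                    return at(ln, "expected </" + open.top() + ">");
                open.pop();
            } else {
                open.push(name);
            }
        }
    }

    // Tags left open are reported on the last line, as the reference code does.
    if (!open.empty())
        return at(lines.size(), "expected </" + open.top() + ">");
    return "OK";
}
```

A caller would read NL lines with `std::getline`, pass them to `checkDocument`, and print the result after the `Test Case` header. Because the whole test case is read before it is checked, skipping the remaining lines after an error needs no special handling, at the cost of holding the test case in memory.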