HTML-XML-utils is a set of small C programs (filters) that read HTML and XML files and can add a table of contents, an alphabetical index, a bibliography, cross-references, or numbered headings, and can also remove elements, count elements, pretty-print files, and so on. When reading HTML, the tools assume the input is correct HTML 4.0 or close to it.
The package includes the following utilities:
asc2xml - convert from UTF-8 to &#nnn; entities
xml2asc - convert from &#nnn; entities to UTF-8
hxaddid - add IDs to selected elements
hxcite - replace bibliographic references by hyperlinks
hxcite-mkbib - expand references and create bibliography
hxclean - apply heuristics to correct an HTML file
hxcopy - copy an HTML file while preserving relative links
hxcount - count elements and attributes in HTML or XML files
hxextract - extract selected elements
hxincl - expand included HTML or XML files
hxindex - create an alphabetically sorted index
hxmkbib - create bibliography from a template
hxmultitoc - create a table of contents for a set of HTML files
hxname2id - move some ID= or NAME= from A elements to their parents
hxnormalize - pretty-print an HTML file
hxnsxml - convert output of hxxmlns back to normal XML
hxnum - number section headings in an HTML file
hxpipe - convert XML to a format easier to parse with Perl or AWK
hxprintlinks - number links & add table of URLs at end of an HTML file
hxprune - remove marked elements from an HTML file
hxref - generate cross-references
hxselect - extract elements that match a (CSS) selector
hxtoc - insert a table of contents in an HTML file
hxuncdata - replace CDATA sections by character entities
hxunent - replace HTML predefined character entities with UTF-8 characters
hxunpipe - convert output of hxpipe back to XML format
hxunxmlns - replace "global names" by XML Namespace prefixes
hxwls - list links in an HTML file
hxxmlns - replace XML Namespace prefixes by "global names"
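Because each utility is a filter, several of them can be chained in a shell pipeline. The following is a minimal sketch, assuming the package is already installed; the file name demo.html and its contents are made up for illustration. It uses hxnormalize with the -x option to turn the input into well-formed XML, hxselect with the -c option to print only the content of the matched element, and hxwls to list the links.

```shell
# Create a small sample file to work on (hypothetical content).
cat > demo.html <<'EOF'
<html><head><title>Demo page</title></head>
<body><h1>Intro</h1>
<p>See <a href="https://example.org/">the example site</a>.
</body></html>
EOF

# hxnormalize -x rewrites the HTML as well-formed XML;
# hxselect -c prints only the content of elements matching the CSS selector.
title=$(hxnormalize -x < demo.html | hxselect -c 'title')
echo "$title"

# hxwls lists the links found in the file, one per line.
links=$(hxwls < demo.html)
echo "$links"
```

Running hxnormalize first is a common precaution, since hxselect expects well-formed input.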
HTML-XML-utils Installation:
Open a terminal and type the following command:
sudo apt-get install html-xml-utils
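To confirm that the installation worked, one option is to check that a few of the programs are on the PATH. This is only a sketch: the three tools checked here are an arbitrary sample from the list, and the variable name missing is chosen for illustration.

```shell
# Count how many of the sampled tools cannot be found on the PATH.
missing=0
for tool in hxnormalize hxselect hxwls; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: found"
  else
    echo "$tool: missing"
    missing=$((missing + 1))
  fi
done
echo "missing tools: $missing"
```

A count of 0 suggests the package is installed and its programs are reachable from the shell.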