Working with XML using standard Unix tools
Classified in : Homepage, Debian, Command line, To remember
Like it or not, XML is used everywhere, even in cases where a plain text format would have been sufficient. Unfortunately, standard tools such as grep, sed or awk are line-oriented and poorly suited to working with XML. Let us take the following example:
<chapter xmlns="http://docbook.org/ns/docbook" version="5.0">
  <title>The Debian distribution</title>
  <para>Debian is a free operating system, describing itself as “the universal operating system”. It is mostly known as a GNU/Linux distribution, but it also exists in other variants such as GNU/Hurd and GNU/kFreeBSD…</para>
</chapter>
grep --only-matching
Classified in : Homepage, Debian, Command line, To remember
grep is designed to print the lines matching a given pattern, but I often need to print only the matching part and discard the rest of the line.
I used to do that with sed, but it takes several steps: match the line, replace it with only the matched part, and print the result. Fortunately, GNU grep has an option to do just that:
-o, --only-matching
    Print only the matched (non-empty) parts of a matching line, with each such part on a separate output line.
Unfortunately it is not a standard option, so it may be missing on non-GNU systems.
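To make the comparison concrete, here is a minimal sketch of both approaches, assuming GNU grep and a POSIX sed. The function names, the sample text and the pattern are only illustrations, not taken from any real script.

```shell
# Hypothetical helpers: both read text on stdin and print only the part
# that looks like "Debian <number>".

extract_with_sed() {
    # -n suppresses the default output; the trailing p prints a line only
    # when the substitution succeeded, i.e. when the pattern matched.
    # At most one match per input line is printed.
    sed -n 's/.*\(Debian [0-9][0-9]*\).*/\1/p'
}

extract_with_grep() {
    # -o prints each matched part on its own output line,
    # so several matches on the same input line are all printed.
    grep -o 'Debian [0-9][0-9]*'
}

# Illustrative input: one matching line, one non-matching line.
printf 'I run Debian 12 here\nnothing to see\n' | extract_with_sed
printf 'I run Debian 12 here\nnothing to see\n' | extract_with_grep
```

Both commands print the same thing for this input. The two approaches differ when a line contains several matches: the sed version prints only one of them, while grep -o prints each match on a separate line.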