Difference between revisions of "Xml files"
Latest revision as of 00:11, 7 September 2010
Introduction
Many files exported and imported by Geopsy software are based on XML structures, for instance (listed by their common extension):
- page
- mkup
- layer
- cpanel
- ctparser
- gpy
- dinver
- param
- target
Decompression
These files are compressed (gzip) TAR archives containing at least one file called contents.xml. To extract one:
tar xvfz Legend_page.page
Note that option 'z' may not be necessary if your browser recognized the page file as a compressed gz file and silently decompressed it on download. Either way, extraction produces a file named contents.xml. Some archives may also contain binary files named bin_data_1*. The format of those files is currently undocumented; refer to the source code for details.
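As a rough sketch of the note above, the following shell snippet checks whether a file is still gzip-compressed before choosing the tar options. The archive sample.page and its one-line contents.xml are stand-ins created only so the snippet runs on its own; they are not real Geopsy output.

```shell
# Build a tiny stand-in archive in a scratch directory (hypothetical content)
workdir=$(mktemp -d)
cd "$workdir"
printf '<SciFigs></SciFigs>\n' > contents.xml
tar czf sample.page contents.xml
rm contents.xml

# gzip -t only tests the compressed stream; it succeeds for gzip data
# and fails for a plain (already decompressed) tar file
if gzip -t < sample.page 2>/dev/null; then
  tar xzf sample.page    # still compressed: 'z' is needed
else
  tar xf sample.page     # already decompressed: plain tar
fi

ls
```

Replace sample.page with the name of your own file and skip the first block that creates the stand-in archive.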
Visualization
contents.xml can be viewed in any web browser (e.g. Firefox):
<SciFigs>
  <libVersion>2.3.0</libVersion>
  <type>Page</type>
  <GraphicSheet>
    <LegendWidget objectName="object">
      <objectName>object</objectName>
      <printX>1</printX>
      <printY>1</printY>
      <anchor>TopLeft</anchor>
      ...
Editing
The encoding of contents.xml is UTF-16, which cannot be edited directly in a text editor that does not support Unicode. However, most modern editors do:
- Notepad++ (Windows only): UTF-16 is automatically recognized.
- Vim: on most systems, UTF-16 is automatically recognized. If not, follow the instructions below.
- Kate or KWrite: UTF-16 is not automatically recognized up to version 4.4 (later versions not tested), but it can be selected manually in the menu (Tools/Encoding/Unicode/UTF-16) or on the command line:
kate contents.xml --encoding UTF-16
kwrite contents.xml --encoding UTF-16
- nano: UTF-16 might be supported, but opening the file was not successful with version 2.2.4 on a platform with LC_ALL=en_US.UTF8.
- Notepad and Wordpad (Windows only): UTF-16 is not supported. Follow the instructions below.
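Before choosing an editor, it can help to check whether a file really holds UTF-16 data. The snippet below is a minimal sketch that looks at the first two bytes for a byte order mark. It assumes the file starts with one, which this page does not confirm for Geopsy output, and the four-character sample file is a stand-in written only for the demonstration.

```shell
workdir=$(mktemp -d)
# Stand-in file: the text "<a/>" in UTF-16LE, preceded by the bytes FF FE
# (written with octal escapes); not real Geopsy output
printf '\377\376<\000a\000/\000>\000' > "$workdir/contents.xml"

# Read the first two bytes as hexadecimal, without spaces or newlines
bom=$(head -c 2 "$workdir/contents.xml" | od -An -tx1 | tr -d ' \n')

case "$bom" in
  fffe) encoding="UTF-16LE" ;;
  feff) encoding="UTF-16BE" ;;
  *)    encoding="no UTF-16 byte order mark" ;;
esac
echo "$encoding"
```

A file without a byte order mark may still be UTF-16, so the last branch is not proof of another encoding.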
ASCII Conversion
Converting to ASCII is useful if you do not have an editor that supports UTF-16, or if the command-line tools you run from Bash (e.g. grep, awk, sed) do not support UTF-16. iconv is used for encoding conversions:
iconv -f UTF-16 -t ASCII contents.xml > tmp; mv tmp contents.xml
Any special characters (non-ASCII characters) are lost in this conversion. This matters only for titles and text displayed, for instance, in page files when a language other than English is used. Note that editing the extracted contents.xml does not alter the original archive, so all modifications can still be discarded.
Now contents.xml can be manipulated as an ASCII file.
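If the goal is only to search the file, a possible alternative (not described on this page) is to decode it on the fly and leave contents.xml untouched. The sketch below converts to UTF-8 rather than ASCII, so non-ASCII characters survive. The two-line sample file is a stand-in built from ASCII text, not real Geopsy output.

```shell
workdir=$(mktemp -d)
cd "$workdir"
# Stand-in for an extracted contents.xml, encoded as UTF-16
printf '<type>Page</type>\n<anchor>TopLeft</anchor>\n' \
  | iconv -f ASCII -t UTF-16 > contents.xml

# Decode to UTF-8 in a pipe and search; the file on disk stays UTF-16
match=$(iconv -f UTF-16 -t UTF-8 contents.xml | grep '<anchor>')
echo "$match"
```

The same pipe works with awk or sed in place of grep, as long as the result is only read and not written back.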
Saving modifications or compression
To pack contents.xml back into the original compressed file format, the complete archive must be reconstructed:
tar cvfz Legend_page.page contents.xml
In this case, contents.xml may be encoded in UTF-16, UTF-8 or ASCII. If binary files were present in the original archive, they must be packed together with contents.xml. The order of the files is not critical.
tar cvfz Legend_page.page contents.xml bin_data_*
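The whole cycle (unpack, edit, repack) can be sketched as follows. The archive, the single-character binary file bin_data_1 and the change of printX from 1 to 2 are all made up for illustration. The backup copy is an added precaution, not a step required by Geopsy.

```shell
workdir=$(mktemp -d)
cd "$workdir"
# Stand-in archive: an ASCII contents.xml plus one hypothetical binary file
printf '<printX>1</printX>\n' > contents.xml
printf 'x' > bin_data_1
tar czf Legend_page.page contents.xml bin_data_1
rm contents.xml bin_data_1

# Unpack into a separate directory so nothing else gets packed by accident
mkdir unpacked
tar xzf Legend_page.page -C unpacked

# Illustrative edit: change the value of printX
sed 's|<printX>1</printX>|<printX>2</printX>|' unpacked/contents.xml > unpacked/contents.tmp
mv unpacked/contents.tmp unpacked/contents.xml

# Keep the original until the new archive has been checked
cp Legend_page.page Legend_page.page.bak

# Repack everything that was extracted (contents.xml and any binary files)
(cd unpacked && tar czf ../Legend_page.page *)
tar tzf Legend_page.page
```

Packing from inside the directory keeps the file names at the top level of the archive, as in the original.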