Selecting the 200 best models through the command line
Posted: Tue Jan 15, 2008 6:28 pm
There are two options you can use to extract models from a .report file: -n and -m.
gpdcreport reads the .report file sequentially. When it encounters a model with a misfit less than 0.7, that model is output. When 200 models have been output, it stops. There may be more models with low misfit afterwards; they are ignored. The best misfit may not even be output.
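The effect can be reproduced without gpdcreport. The sketch below only emulates the behaviour described above with awk; it is not gpdcreport's implementation. The sample lines are model lines taken from a -pm listing of run_01.report, first_matches is a hypothetical helper name, and the limit is 1 instead of 200 to keep the example short.

```shell
# Sample "-pm" style lines: index, number of parameters, parameters, misfit.
models='2547 4 414.864 1191.89 1243.14 2779.02 0.620958
2548 4 414.13 1194.27 1244.61 2780.87 0.620442
2549 4 414.199 1193.72 1244.14 2781.21 0.621037'

# Hypothetical helper: print the first $1 lines whose misfit is below $2,
# in file order, then stop reading (the behaviour described for -n and -m).
first_matches() {
  awk -v n="$1" -v m="$2" '
    { misfit = $(3 + $2) }
    misfit + 0 < m + 0 { print; if (++count >= n) exit }
  '
}

# Prints model 2547 (misfit 0.620958), although model 2548 further down
# has a lower misfit (0.620442).
printf '%s\n' "$models" | first_matches 1 0.7
```

The first model under the threshold wins, not the best one, which is why the selection has to be sorted by hand.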
To solve this issue, you have to sort and select the models yourself with option -i. The first step is to get the list of the very best models:
The first column is the model index, the second is the number of parameters for each model, followed by the value of each parameter (4 columns here) and finally the misfit value. What we need is the first and the last column. We use awk to get them:
Then we sort by increasing misfit and we select only the first 200:
We extract only the indexes needed by gpdcreport to reference the models to output:
Finally, select the type of output you want (-pm...):
You can also re-export to a new or another existing report with option -report. This last operation does not work with the current release. It will be available in the next release (> 20080115).
Code:
gpdcreport run_01.report -pm -n 200 -m 0.7
Code:
gpdcreport run_01.report -pm
[...]
2547 4 414.864 1191.89 1243.14 2779.02 0.620958
2548 4 414.13 1194.27 1244.61 2780.87 0.620442
2549 4 414.199 1193.72 1244.14 2781.21 0.621037
Code:
gpdcreport run_01.report -pm | awk '{nd=$2;print $(3+nd) " " $1}'
[...]
0.620958 2547
0.620442 2548
0.621037 2549
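Since the misfit is the last column, awk's built-in $NF (last field) is a possible alternative to $(3+nd). A minimal sketch on one sample line, assuming every line of the listing is a model line that ends with the misfit (by_count and by_last are illustrative variable names):

```shell
line='2547 4 414.864 1191.89 1243.14 2779.02 0.620958'

# $(3+nd): misfit located from the parameter count in column 2.
by_count=$(printf '%s\n' "$line" | awk '{ nd = $2; print $(3 + nd) " " $1 }')

# $NF: misfit taken as the last field, whatever the parameter count.
by_last=$(printf '%s\n' "$line" | awk '{ print $NF " " $1 }')

# Both lines print "0.620958 2547".
printf '%s\n%s\n' "$by_count" "$by_last"
```

The $(3+nd) form stays correct even if extra columns were appended after the misfit; $NF is shorter but relies on the misfit being last.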
Code:
gpdcreport run_01.report -pm | awk '{nd=$2;print $(3+nd) " " $1}' | sort -n -k 1 | head -n 200
[...]
0.680778 2171
0.680782 2320
0.680819 2334
Code:
gpdcreport run_01.report -pm | awk '{nd=$2;print $(3+nd) " " $1}' | sort -n -k 1 | head -n 200 | awk '{print $2}'
[...]
2171
2320
2334
Code:
gpdcreport run_01.report -pm | awk '{nd=$2;print $(3+nd) " " $1}' | sort -n -k 1 | head -n 200 | awk '{print $2}' | gpdcreport run_01.report -pm -i
2171 4 443.078 1068 1154.3 2739.32 0.680778
2320 4 444.308 1066.81 1157.04 2740.98 0.680782
2334 4 443.908 1069.52 1159.58 2743.61 0.680819
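To avoid retyping the middle of the pipeline, the sort-and-select steps can be wrapped in a small shell function. This is only a sketch: best_model_indexes is a hypothetical name, and it assumes the -pm column layout shown above (index, number of parameters, parameters, misfit). It reads the listing on standard input, so it can be tried on sample lines without gpdcreport.

```shell
# Six model lines copied from the listings above.
sample_models='2547 4 414.864 1191.89 1243.14 2779.02 0.620958
2548 4 414.13 1194.27 1244.61 2780.87 0.620442
2549 4 414.199 1193.72 1244.14 2781.21 0.621037
2171 4 443.078 1068 1154.3 2739.32 0.680778
2320 4 444.308 1066.81 1157.04 2740.98 0.680782
2334 4 443.908 1069.52 1159.58 2743.61 0.680819'

# Hypothetical helper: read "-pm" style lines on stdin and print the indexes
# of the $1 models with the lowest misfit, best first.
best_model_indexes() {
  awk '{ nd = $2; print $(3 + nd), $1 }' |  # misfit, index
    sort -n -k 1,1 |                        # increasing misfit
    head -n "$1" |                          # keep the N best
    awk '{ print $2 }'                      # index only
}

# Keep the 2 best of the six sample models: prints 2548, then 2547.
printf '%s\n' "$sample_models" | best_model_indexes 2
```

With gpdcreport available, the helper would take the place of the middle of the pipeline above: gpdcreport run_01.report -pm | best_model_indexes 200 | gpdcreport run_01.report -pm -i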