ark ark,t scp
ark is an archive format that can store any Kaldi object. An ark can be written to and read from a Unix pipe.
cat test.ark | copy-feats ark:- ark,t:- | less # Show the contents in the ark
A lone - in place of a filename indicates the standard input stream (when reading) or the standard output stream (when writing).
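To make the text form concrete, the sketch below writes a tiny text-mode archive by hand and lists its keys. The file name test_text.ark, the keys utt1 and utt2, and the numbers are made up for illustration. The Kaldi call at the end is skipped when copy-feats is not on PATH.

```shell
# Hand-written text-mode archive with two hypothetical utterances.
tmpdir=$(mktemp -d)
cat > "$tmpdir/test_text.ark" <<'EOF'
utt1  [
  1 2 3
  4 5 6 ]
utt2  [
  7 8 9 ]
EOF

# Each entry starts with its key followed by "[", so awk can list the keys.
keys=$(awk '$2 == "[" { print $1 }' "$tmpdir/test_text.ark")
echo "$keys"

# With Kaldi installed, the same file can be fed through stdin and stdout.
if command -v copy-feats >/dev/null 2>&1; then
  copy-feats ark:- ark,t:- < "$tmpdir/test_text.ark"
fi

rm -rf "$tmpdir"
```

The awk step only inspects the text layout; it is not a substitute for Kaldi's own readers.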
s, cs, p
s: the keys are sorted
cs: the data is accessed in sorted order (the program crashes if this does not hold)
p: errors are ignored
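A minimal sketch of how these options might be chosen, assuming keys are compared as plain byte strings (the reason Kaldi recipes usually sort with LC_ALL=C). The file feats.scp, its keys, and the byte offsets are hypothetical.

```shell
# Hypothetical script file; the archive name and byte offsets are made up.
tmpdir=$(mktemp -d)
scp="$tmpdir/feats.scp"
printf '%s\n' \
  'utt1 feats.ark:5' \
  'utt10 feats.ark:905' \
  'utt2 feats.ark:1805' > "$scp"

# sort -c only checks the order; -k1,1 restricts the comparison to the key.
# In the C locale "utt10" sorts before "utt2", so this file counts as sorted.
if LC_ALL=C sort -c -k1,1 "$scp" 2>/dev/null; then
  rspecifier="scp,s,cs:$scp"   # keys are sorted and will be read in order
else
  rspecifier="scp:$scp"        # make no promise about the order
fi
echo "$rspecifier"

rm -rf "$tmpdir"
```

Options sit between the type and the colon, separated by commas, so p can be added the same way, for example scp,p:feats.scp.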
FM & FV
Kaldi has two major data types: Matrix and Vector. Each can hold floats or doubles and can be written in binary or text form.
- Float/Double Matrix: FM, DM
- Float/Double Vector: FV, DV
As such, features are often stored in one of these two file types. For instance, extracted i-vectors are stored as a matrix of floats (FM), whereas extracted x-vectors are stored as vectors of floats (FV). It is often necessary to convert features stored as FV to FM and vice versa.
Convert from FV to FM:
copy-vector --binary=false scp:exp/xvectors/xvector.scp ark,t:- | \
copy-matrix ark,t:- ark,scp:exp/xvectors/xvector_mat.ark,exp/xvectors/xvector_mat.scp
Convert from FM to FV:
copy-matrix --binary=false scp:exp/ivectors/ivector.scp ark,t:- | \
copy-vector ark,t:- ark,scp:exp/ivectors/ivector_vec.ark,exp/ivectors/ivector_vec.scp
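Both conversions go through text mode. As far as I can tell, this works because a text vector and a text matrix with a single row contain the same sequence of tokens and differ only in whitespace. The sketch below checks that on two hand-written entries; the key utt1 and the values are made up, and no Kaldi binary is needed.

```shell
# Text form of a vector entry (FV) and of a one-row matrix entry (FM).
vec_text='utt1  [ 0.1 0.2 0.3 ]'
mat_text='utt1  [
  0.1 0.2 0.3 ]'

# Collapse every run of spaces and newlines into a single space.
normalize() {
  printf '%s\n' "$1" | tr -s ' \n' ' ' | sed 's/ $//'
}

vec_tokens=$(normalize "$vec_text")
mat_tokens=$(normalize "$mat_text")

if [ "$vec_tokens" = "$mat_tokens" ]; then
  echo "same token stream: $vec_tokens"
else
  echo "token streams differ"
fi
```

A matrix with more than one row has no vector equivalent, so the FM to FV direction only makes sense for one-row matrices.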