ark is Kaldi's archive format and can store any Kaldi object. An archive can be read from or written to a Unix pipe:
cat test.ark | copy-feats ark:- ark,t:- | less # Show the contents in the ark
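Here "-" in place of a file name means standard input (when reading) or standard output (when writing), and the "t" in ark,t: requests text rather than binary output.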
Kaldi has two major numeric types, Matrix and Vector, each of which can be stored in binary or text form with float or double precision:
- Binary/Text - Float/Double Matrix: FM, DM
- Binary/Text - Float/Double Vector: FV, DV
As such, features are usually stored as one of these two types. For instance, extracted i-vectors are stored as a matrix of floats (FM), whereas extracted x-vectors are stored as vectors of floats (FV). It is often necessary to convert features stored as FV to FM, and vice versa.
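Since text-mode archives are plain lines on a pipe, ordinary Unix tools can inspect them before or after a conversion. The following is a minimal sketch, not part of Kaldi: the helper name vector_dims is hypothetical, the sample data is made up, and it assumes each entry of a text-mode vector archive sits on one line in the form "utt-id  [ v1 v2 ... vN ]".

```shell
# Hypothetical helper (not a Kaldi tool): print "<utt-id> <dimension>" for
# each entry of a text-mode vector archive read from standard input.
# Assumption: one entry per line, laid out as: utt-id  [ v1 v2 ... vN ]
vector_dims() {
  # Fields are: id, "[", N values, "]"; so the dimension is NF - 3.
  awk '$2 == "[" && $NF == "]" { print $1, NF - 3 }'
}

# Self-contained demo on made-up data (not real x-vectors).
printf 'utt1  [ 0.1 0.2 0.3 ]\nutt2  [ 1 2 3 4 ]\n' | vector_dims
```

With Kaldi installed, the same helper could be fed real data, for example: copy-vector scp:exp/xvectors/xvector.scp ark,t:- | vector_dims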
Convert from FV to FM:
copy-vector --binary=false scp:exp/xvectors/xvector.scp ark,t:- | \
  copy-matrix ark,t:- ark,scp:exp/xvectors/xvector_mat.ark,exp/xvectors/xvector_mat.scp
Convert from FM to FV:
copy-matrix --binary=false scp:exp/ivectors/ivector.scp ark,t:- | \
  copy-vector ark,t:- ark,scp:exp/ivectors/ivector_vec.ark,exp/ivectors/ivector_vec.scp