This node sends the following acoustic signal results by socket communication.
Acoustic signal
Frequency spectrum after STFT
Source information of source localization result
Acoustic feature
Missing Feature Mask
No files are required.
When to use
This node is used to send the above data to a system external to HARK using TCP/IP communication.
Typical connection
In the example in Figure 6.10, all input terminals are connected. It is also possible to leave input terminals open depending on the transmitted data. To learn about the relation between the connection of the input terminals and transmitted data, see “Details of the node”.
| Parameter name | Type | Default value | Unit | Description |
|---|---|---|---|---|
| HOST | string | localhost | | Host name / IP address of the server to which data is sent |
| PORT | int | 8890 | | Port number for outbound network communication |
| ADVANCE | int | 160 | [pt] | Shift length of frame |
| BUFFER_SIZE | int | 512 | | Size of allocated float-sized memory for socket communication |
| FRAMES_PER_SEND | int | 1 | [frm] | Frequency of socket communication in frame unit |
| TIMESTAMP_TYPE | string | GETTIMEOFDAY | | Time stamped to the sent data |
| SAMPLING_RATE | int | 16000 | [Hz] | Sampling frequency |
| DEBUG_PRINT | bool | false | | ON/OFF for outputting debugging information |
| SOCKET_ENABLE | bool | true | | Flag to determine whether or not to perform the socket output |

Input
MIC_WAVE : Matrix<float> type. Acoustic signal (number of channels $\times $ samples per channel, where the number of samples is the STFT window length)
MIC_FFT : Matrix<complex<float> > type. Frequency spectrum (number of channels $\times $ spectrum of each channel)
SRC_INFO : Vector<ObjectRef> type. Source information from the source localization results for several sound sources
SRC_WAVE : Map<int, ObjectRef> type. Pairs of a sound source ID and its acoustic signal (Vector<float> type).
SRC_FFT : Map<int, ObjectRef> type. Pairs of a sound source ID and its frequency spectrum (Vector<complex<float> > type).
SRC_FEATURE : Map<int, ObjectRef> type. Pairs of a sound source ID and its acoustic feature (Vector<float> type).
SRC_RELIABILITY : Map<int, ObjectRef> type. Pairs of a sound source ID and its mask vector (Vector<float> type).
Output
: ObjectRef type. Same output as the input.
Parameter
HOST : string type. Host name or IP address of the host to which data is transmitted. It is ignored when SOCKET_ENABLE is set to false.
PORT : int type. Port number. It is ignored when SOCKET_ENABLE is set to false.
ADVANCE : int type. Shift length of a frame. It must be equal to the value set in the preceding processing.
BUFFER_SIZE : int type. Buffer size allocated for socket communication.
FRAMES_PER_SEND : int type. Frequency of socket communication in frame unit.
TIMESTAMP_TYPE : string type. Setting for the time stamp attached to the sent data. If TIMESTAMP_TYPE=GETTIMEOFDAY, the time obtained by gettimeofday is stamped. If TIMESTAMP_TYPE=CONSTANT_INCREMENT, the time stamp is incremented for each frame by a constant frame time calculated from SAMPLING_RATE.
SAMPLING_RATE : int type. Sampling frequency of the input signal. This is valid only when TIMESTAMP_TYPE=CONSTANT_INCREMENT.
DEBUG_PRINT : bool type. ON/OFF of debug output to standard output.
SOCKET_ENABLE : bool type. Data is transferred to the socket when true and not transferred when false.
Description of the parameters
For HOST, specify the host name or IP address of the host running the external program that receives the data. For PORT, specify the network port number used for data transmission. ADVANCE is the shift length of a frame and must be equal to the value set in the preceding processing.

BUFFER_SIZE is the buffer size allocated for socket communication. A float type array of BUFFER_SIZE * 1024 elements is allocated at initialization, and it must be larger than the transmitted data. FRAMES_PER_SEND is the frequency of socket communication in frame unit. The default value of 1 sends data in every frame and is sufficient in most cases. To reduce the amount of socket communication, increase this value.

TIMESTAMP_TYPE selects the time stamp attached to the sent data. SAMPLING_RATE is the sampling frequency of the input signal. DEBUG_PRINT selects whether debug information is written to standard output; it prints some parts of the transmitted data. For more information, see “Debug” in Table 6.13. When SOCKET_ENABLE is set to false, data is not sent to external systems. This makes it possible to check that the HARK network runs without starting an external program.
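The BUFFER_SIZE requirement can be checked with simple arithmetic. The following is a minimal sketch, assuming a 4-byte float and considering only the payload of one MIC_WAVE frame (structure headers excluded); the function names and the 8-channel, 512-sample example are hypothetical.

```python
FLOAT_BYTES = 4  # assumption: 4-byte IEEE 754 float


def buffer_capacity_bytes(buffer_size: int) -> int:
    """Bytes held by a float array of BUFFER_SIZE * 1024 elements."""
    return buffer_size * 1024 * FLOAT_BYTES


def mic_wave_payload_bytes(nch: int, length: int) -> int:
    """Bytes of one nch x length float matrix (headers excluded)."""
    return nch * length * FLOAT_BYTES


capacity = buffer_capacity_bytes(512)      # default BUFFER_SIZE
payload = mic_wave_payload_bytes(8, 512)   # hypothetical: 8 channels, 512 samples
print(capacity, payload, payload < capacity)  # 2097152 16384 True
```

With the default value the buffer is far larger than a single frame of this size, so BUFFER_SIZE usually needs attention only when many terminals are connected or many sources are transmitted at once.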
Details of data transmission
(B-1) Structure for data transmission
Data is transmitted frame by frame, and each frame is divided into several parts. The structures defined for data transmission are listed below.
HD_Header
Description: A header that contains basic information on top of the transmitted data
Data size: 3 * sizeof(int) + 2 * sizeof(int64)
Table 6.7: Members of HD_Header

| Variable name | Type | Description |
|---|---|---|
| type | int | Bit flag that indicates the structure of the transmitted data. For relations between each bit and data to be transmitted, see Table 6.8. |
| advance | int | Shift length of a frame |
| count | int | Frame number of HARK |
| tv_sec | int64 | Time stamp of HARK in seconds |
| tv_usec | int64 | Time stamp of HARK in microseconds |


Table 6.8: Each bit of type and the data transmitted (digits are counted from the least significant bit)

| Digit | Related input terminal | Transmitted data |
|---|---|---|
| First digit | MIC_WAVE | Acoustic signal |
| Second digit | MIC_FFT | Frequency spectrum |
| Third digit | SRC_INFO | Source information of the source localization result |
| Fourth digit | SRC_INFO, SRC_WAVE | Source information of the source localization result + acoustic signal for each sound source ID |
| Fifth digit | SRC_INFO, SRC_FFT | Source information of the source localization result + frequency spectrum for each sound source ID |
| Sixth digit | SRC_INFO, SRC_FEATURE | Source information of the source localization result + acoustic feature for each sound source ID |
| Seventh digit | SRC_INFO, SRC_RELIABILITY | Source information of the source localization result + missing feature mask for each sound source ID |

In HarkDataStreamSender, the transmitted data differs depending on which input terminals are connected. On the receiving end, the transmitted data can be interpreted according to the type. Examples are given below. Further details on the transmitted data are given in (B-2).
When only the MIC_FFT input terminal is connected, the type is 0000010 in binary, and only the frequency spectrum of each microphone is transmitted.
When the three input terminals MIC_WAVE, SRC_INFO and SRC_FEATURE are connected, the type is 0100101 in binary. The transmitted data are the acoustic signal of each microphone, the source information of the source localization result and the acoustic feature for each sound source ID.
The four input terminals SRC_WAVE, SRC_FFT, SRC_FEATURE and SRC_RELIABILITY carry data for each sound source ID and therefore require the information of SRC_INFO. If these four input terminals are connected without SRC_INFO, no data is transmitted, and the type is 0000000 in binary.
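The relation between connected terminals and the type value can be sketched in Python as follows. This is an illustration of the rules stated in the text, not the node's actual implementation; the names TERMINAL_BITS, PER_SOURCE and type_flag are hypothetical.

```python
# Bit for each input terminal, counted from the least significant bit.
TERMINAL_BITS = {
    "MIC_WAVE":        0b0000001,
    "MIC_FFT":         0b0000010,
    "SRC_INFO":        0b0000100,
    "SRC_WAVE":        0b0001000,
    "SRC_FFT":         0b0010000,
    "SRC_FEATURE":     0b0100000,
    "SRC_RELIABILITY": 0b1000000,
}

# Terminals that carry per-source data and therefore need SRC_INFO.
PER_SOURCE = {"SRC_WAVE", "SRC_FFT", "SRC_FEATURE", "SRC_RELIABILITY"}


def type_flag(connected):
    """Return the type value for a collection of connected terminal names."""
    connected = set(connected)
    flag = 0
    for name in connected:
        if name in PER_SOURCE and "SRC_INFO" not in connected:
            continue  # per-source data is dropped without SRC_INFO
        flag |= TERMINAL_BITS[name]
    return flag


print(format(type_flag({"MIC_FFT"}), "07b"))                              # 0000010
print(format(type_flag({"MIC_WAVE", "SRC_INFO", "SRC_FEATURE"}), "07b"))  # 0100101
print(format(type_flag(PER_SOURCE), "07b"))                               # 0000000
```

A receiver can test the same masks against the received type (for example `type & 0b0000010`) to decide which parts of a frame to expect.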
HDH_MicData
Description: Structural information on the array size for sending two-dimensional arrays
Data size: 3 * sizeof(int)
Table 6.9: Members of HDH_MicData

| Variable name | Type | Description |
|---|---|---|
| nch | int | Number of microphone channels |
| length | int | Data length (number of columns of the two-dimensional array to be transmitted) |
| data_bytes | int | Number of bytes of data to be transmitted. In the case of a float type matrix, nch * length * sizeof(float). |

HDH_SrcInfo
Description: Source information of a source localization result
Data size: 1 * sizeof(int) + 4 * sizeof(float)

Table 6.10: Members of HDH_SrcInfo

| Variable name | Type | Description |
|---|---|---|
| src_id | int | Sound source ID |
| x[3] | float | Three-dimensional position of the sound source |
| power | float | Power of the MUSIC spectrum calculated in LocalizeMUSIC |

HDH_SrcData
Description: Structural information on the array size for sending one-dimensional arrays
Data size: 2 * sizeof(int)
Table 6.11: Members of HDH_SrcData

| Variable name | Type | Description |
|---|---|---|
| length | int | Data length (number of one-dimensional array elements to be transmitted) |
| data_bytes | int | Number of bytes of transmitted data. In the case of a float type vector, length * sizeof(float). |

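On the receiving side, the four structures can be described with Python's `struct` module. The sketch below assumes little-endian byte order, 4-byte int and float, 8-byte int64, and no padding between members (which matches the data sizes listed for each structure); none of these is stated explicitly in the text, and the names used are hypothetical.

```python
import struct

# Assumptions: little-endian, 4-byte int/float, 8-byte int64, no padding.
HD_HEADER = struct.Struct("<3i2q")     # type, advance, count, tv_sec, tv_usec
HDH_MIC_DATA = struct.Struct("<3i")    # nch, length, data_bytes
HDH_SRC_INFO = struct.Struct("<i3ff")  # src_id, x[3], power
HDH_SRC_DATA = struct.Struct("<2i")    # length, data_bytes


def unpack_header(buf: bytes) -> dict:
    """Decode an HD_Header from the first bytes of buf."""
    type_, advance, count, tv_sec, tv_usec = HD_HEADER.unpack_from(buf)
    return {
        "type": type_,
        "advance": advance,
        "count": count,
        "timestamp": tv_sec + tv_usec / 1_000_000,  # seconds as a float
    }


# Sizes agree with the "Data size" entries under the assumptions above.
print(HD_HEADER.size, HDH_MIC_DATA.size, HDH_SRC_INFO.size, HDH_SRC_DATA.size)
# 28 12 20 8

example = HD_HEADER.pack(0b0000010, 160, 42, 10, 500000)
print(unpack_header(example))
```

If the sender and receiver run on machines with different byte order or structure alignment, the format strings would need to be adjusted accordingly.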
(B-2) Transmitted data
Table 6.12: Transmitted data (a)-(q) and the input terminals connected

| | Type | Size | MIC_WAVE | MIC_FFT | SRC_INFO | SRC_WAVE | SRC_FFT | SRC_FEATURE | SRC_RELIABILITY |
|---|---|---|---|---|---|---|---|---|---|
| (a) | HD_Header | sizeof(HD_Header) | $\circ $ | $\circ $ | $\circ $ | $\circ $ | $\circ $ | $\circ $ | $\circ $ |
| (b) | HDH_MicData | sizeof(HDH_MicData) | $\circ $ | | | | | | |
| (c) | float[] | HDH_MicData.data_bytes | $\circ $ | | | | | | |
| (d) | HDH_MicData | sizeof(HDH_MicData) | | $\circ $ | | | | | |
| (e) | float[] | HDH_MicData.data_bytes | | $\circ $ | | | | | |
| (f) | float[] | HDH_MicData.data_bytes | | $\circ $ | | | | | |
| (g) | int | 1 * sizeof(int) | | | $\circ $ | $\circ ^*$ | $\circ ^*$ | $\circ ^*$ | $\circ ^*$ |
| (h) | HDH_SrcInfo | sizeof(HDH_SrcInfo) | | | $\circ $ | $\circ ^*$ | $\circ ^*$ | $\circ ^*$ | $\circ ^*$ |
| (i) | HDH_SrcData | sizeof(HDH_SrcData) | | | | $\circ ^*$ | | | |
| (j) | short int[] | HDH_SrcData.data_bytes | | | | $\circ ^*$ | | | |
| (k) | HDH_SrcData | sizeof(HDH_SrcData) | | | | | $\circ ^*$ | | |
| (l) | float[] | HDH_SrcData.data_bytes | | | | | $\circ ^*$ | | |
| (m) | float[] | HDH_SrcData.data_bytes | | | | | $\circ ^*$ | | |
| (n) | HDH_SrcData | sizeof(HDH_SrcData) | | | | | | $\circ ^*$ | |
| (o) | float[] | HDH_SrcData.data_bytes | | | | | | $\circ ^*$ | |
| (p) | HDH_SrcData | sizeof(HDH_SrcData) | | | | | | | $\circ ^*$ |
| (q) | float[] | HDH_SrcData.data_bytes | | | | | | | $\circ ^*$ |

$\circ ^*$: transmitted only when SRC_INFO is also connected.

Table 6.13: Description of the transmitted data (a)-(q)

| | Description | Debug |
|---|---|---|
| (a) | Transmitted data header. See Table 6.7. | $\circ $ |
| (b) | Structure of acoustic signals (number of microphones, frame length, byte count for transmission). See Table 6.9. | $\circ $ |
| (c) | Acoustic signal (number of microphones $\times $ float type matrix of frame length) | |
| (d) | Structure of frequency spectra (number of microphones, number of frequency bins, byte count for transmission). See Table 6.9. | $\circ $ |
| (e) | Real part of frequency spectrum (number of microphones $\times $ float type matrix of number of frequency bins) | |
| (f) | Imaginary part of frequency spectrum (number of microphones $\times $ float type matrix of number of frequency bins) | |
| (g) | Number of sound sources detected | $\circ $ |
| (h) | Source information of a source localization result. See Table 6.10. | $\circ $ |
| (i) | Structure of the acoustic signal for each sound source ID (frame length, byte count for transmission). See Table 6.11. | $\circ $ |
| (j) | Acoustic signal for each sound source ID (float type linear array of frame length) | |
| (k) | Structure of the frequency spectrum for each sound source ID (number of frequency bins, byte count for transmission). See Table 6.11. | $\circ $ |
| (l) | Real part of a frequency spectrum for each sound source ID (float type linear array of number of frequency bins) | |
| (m) | Imaginary part of a frequency spectrum for each sound source ID (float type linear array of number of frequency bins) | |
| (n) | Structure of the acoustic feature for each sound source ID (dimension number of features, byte count for transmission). See Table 6.11. | $\circ $ |
| (o) | Acoustic feature for each sound source ID (float type linear array of dimension number of features) | |
| (p) | Structure of the MFM for each sound source ID (dimension number of features, byte count for transmission). See Table 6.11. | $\circ $ |
| (q) | MFM for each sound source ID (float type linear array of dimension number of features) | |

Transmitted data is divided for each frame as shown in (a)-(q) of Tables 6.12 and 6.13. Table 6.12 shows the relation between the transmitted data (a)-(q) and the input terminal connected, and Table 6.13 describes the transmitted data.
    calculate {
        Send (a)
        IF MIC_WAVE is connected
            Send (b)
            Send (c)
        ENDIF
        IF MIC_FFT is connected
            Send (d)
            Send (e)
            Send (f)
        ENDIF
        IF SRC_INFO is connected
            Send (g)    (Let the number of sound sources be 'src_num'.)
            FOR i = 1 to src_num    (This loop runs once per sound source ID.)
                Send (h)
                IF SRC_WAVE is connected
                    Send (i)
                    Send (j)
                ENDIF
                IF SRC_FFT is connected
                    Send (k)
                    Send (l)
                    Send (m)
                ENDIF
                IF SRC_FEATURE is connected
                    Send (n)
                    Send (o)
                ENDIF
                IF SRC_RELIABILITY is connected
                    Send (p)
                    Send (q)
                ENDIF
            ENDFOR
        ENDIF
    }
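A receiver can read one frame by following the same order as the send procedure and using the bits of type to decide which parts are present. The Python sketch below is one possible reading of that procedure, not a reference implementation. It assumes little-endian byte order, 4-byte int and float, 8-byte int64 and unpadded structures; parse_frame and the helper names are hypothetical. Payloads are returned as raw bytes so that the caller decides how to decode them.

```python
import io
import struct

# Assumptions, not stated in the text: little-endian byte order, 4-byte int
# and float, 8-byte int64, and structures sent without padding.
HD_HEADER = struct.Struct("<3i2q")  # type, advance, count, tv_sec, tv_usec
MIC_DATA = struct.Struct("<3i")     # nch, length, data_bytes
SRC_INFO = struct.Struct("<i4f")    # src_id, x[3], power
SRC_DATA = struct.Struct("<2i")     # length, data_bytes
INT = struct.Struct("<i")

# Bits of type for the per-source terminals, with the key used in the result.
PER_SOURCE_BITS = [(0b0001000, "wave", 1), (0b0010000, "fft", 2),
                   (0b0100000, "feature", 1), (0b1000000, "reliability", 1)]


def read_exact(stream, n):
    """Read exactly n bytes, or raise if the stream ends inside a frame."""
    data = stream.read(n)
    if len(data) != n:
        raise EOFError("stream ended inside a frame")
    return data


def read_block(stream, header, arrays):
    """Read a size header followed by `arrays` payloads of data_bytes each."""
    *dims, nbytes = header.unpack(read_exact(stream, header.size))
    return tuple(dims) + tuple(read_exact(stream, nbytes) for _ in range(arrays))


def parse_frame(stream):
    """Parse one frame (a)-(q), keeping every payload as undecoded bytes."""
    type_, advance, count, tv_sec, tv_usec = HD_HEADER.unpack(
        read_exact(stream, HD_HEADER.size))                    # (a)
    frame = {"type": type_, "advance": advance, "count": count,
             "tv_sec": tv_sec, "tv_usec": tv_usec, "sources": []}
    if type_ & 0b0000001:                                      # MIC_WAVE
        frame["mic_wave"] = read_block(stream, MIC_DATA, 1)    # (b), (c)
    if type_ & 0b0000010:                                      # MIC_FFT
        frame["mic_fft"] = read_block(stream, MIC_DATA, 2)     # (d), (e), (f)
    if type_ & 0b0000100:                                      # SRC_INFO
        (src_num,) = INT.unpack(read_exact(stream, INT.size))  # (g)
        for _ in range(src_num):
            src_id, x, y, z, power = SRC_INFO.unpack(
                read_exact(stream, SRC_INFO.size))             # (h)
            src = {"id": src_id, "x": (x, y, z), "power": power}
            for bit, key, arrays in PER_SOURCE_BITS:           # (i)-(q)
                if type_ & bit:
                    src[key] = read_block(stream, SRC_DATA, arrays)
            frame["sources"].append(src)
    return frame
```

For a TCP connection, a buffered file object such as the one returned by `socket.makefile("rb")` can serve as the stream; `io.BytesIO` is convenient for testing with recorded data.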