


Title:
METHOD AND SYSTEM FOR SEPARATION OF SOUNDS FROM DIFFERENT SOURCES
Document Type and Number:
WIPO Patent Application WO/2021/107941
Kind Code:
A1
Abstract:
The present disclosure provides a system for sound separation, comprising a sound collector for collecting a physiological signal and a sound separation device. The sound separation device comprises a transformation unit for transforming the physiological signal into a spectrum; an encoder for generating a coded matrix from the spectrum; a Fourier transformer for generating a periodicity coded matrix according to the coded matrix; a latent space cluster for grouping a plurality of clustered coded matrices according to the coded matrix and the periodicity coded matrix; a decoder for generating a plurality of clustered spectrums from the plurality of the clustered coded matrices; and an inverse Fourier transformer for reconstructing a plurality of clustered sound signals from a plurality of the clustered spectrums.

Inventors:
WANG WEI-CHIEN (TW)
Application Number:
PCT/US2019/063480
Publication Date:
June 03, 2021
Filing Date:
November 27, 2019
Assignee:
VITALCHAINS CORP (US)
International Classes:
A61B7/04; A61B5/08; A61B7/00; G10L19/008; G10L19/02
Domestic Patent References:
WO2019079829A12019-04-25
Foreign References:
US20110249822A12011-10-13
US20170270941A12017-09-21
US20170301354A12017-10-19
Other References:
KONSTANTINOS KAMNITSAS; DANIEL C. CASTRO; LOIC LE FOLGOC; IAN WALKER; RYUTARO TANNO; DANIEL RUECKERT; BEN GLOCKER; ANTONIO CRIMINISI: "Semi-Supervised Learning via Compact Latent Space Clustering", arXiv.org, Cornell University Library, 7 June 2018 (2018-06-07), XP080888166
POURAZAD M.T.; MOUSSAVI Z.; FARAHMAND F.; WARD R.K.: "Heart Sounds Separation From Lung Sounds Using Independent Component Analysis", 27th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (IEEE-EMBS 2005), Shanghai, China, 1-4 September 2005, pages 2736-2739, XP010908368, ISBN: 978-0-7803-8741-6, DOI: 10.1109/IEMBS.2005.1617037
Attorney, Agent or Firm:
WANG, Dennis (US)
Claims:
WHAT IS CLAIMED IS:

1. A system for sound separation, comprising: a sound collector for collecting a physiological signal; and a sound separation device comprising: a transformation unit, configured to receive the physiological signal from the sound collector, for transforming the physiological signal into a spectrum; an encoder, configured to receive the spectrum from the transformation unit, for generating a coded matrix from the spectrum; a Fourier transformer, configured to receive the coded matrix from the encoder, for generating a periodicity coded matrix according to the coded matrix; a latent space cluster, configured to receive the coded matrix from the encoder and the periodicity coded matrix from the Fourier transformer, for grouping a plurality of clustered coded matrices according to the coded matrix and the periodicity coded matrix; a decoder, configured to receive the clustered coded matrices from the latent space cluster, for generating a plurality of clustered spectrums from the plurality of the clustered coded matrices; and an inverse Fourier transformer, configured to receive the clustered spectrums from the decoder, for reconstructing a plurality of clustered sound signals from the plurality of the clustered spectrums, and each of the clustered sound signals is reconstructed from one of the clustered spectrums.

2. The system according to claim 1, wherein the encoder and the decoder are configured to form a deep autoencoder.

3. The system according to claim 2, wherein the deep autoencoder is implemented with a minimizing-mean-squared-errors (MSE) loss function.

4. The system according to claim 1, wherein the latent space cluster is a K-means latent space cluster.

5. The system according to claim 1, wherein the sound collector is a stethoscope.

6. The system according to claim 1, wherein the clustered sound signals originate from different sources at an auscultation site.

7. The system according to claim 1, wherein the clustered sound signals originate from different sources at different auscultation sites.

8. The system according to claim 1, wherein the physiological signal comprises a heart sound and a lung sound.

9. The system according to claim 8, wherein the plurality of clustered coded matrices comprises a heart sound clustered coded matrix corresponding to the heart sound and a lung sound clustered coded matrix corresponding to the lung sound.

10. The system according to claim 1, wherein the clustered sound signals comprise a clustered heart sound and a clustered lung sound.

11. A sound separation device, comprising: a transformation unit for transforming a physiological signal into a spectrum; an encoder, configured to receive the spectrum from the transformation unit, for generating a coded matrix from the spectrum; a Fourier transformer, configured to receive the coded matrix from the encoder, for generating a periodicity coded matrix according to the coded matrix; a latent space cluster, configured to receive the coded matrix from the encoder and the periodicity coded matrix from the Fourier transformer, for grouping a plurality of clustered coded matrices according to the coded matrix and the periodicity coded matrix; a decoder, configured to receive the clustered coded matrices from the latent space cluster, for generating a plurality of clustered spectrums from the plurality of the clustered coded matrices; and an inverse Fourier transformer, configured to receive the clustered spectrums from the decoder, for reconstructing a plurality of clustered sound signals from the plurality of the clustered spectrums, and each of the clustered sound signals is reconstructed from one of the clustered spectrums.

12. The device according to claim 11, wherein the encoder and the decoder are configured to form a deep autoencoder.

13. The device according to claim 12, wherein the deep autoencoder is implemented with a minimizing-mean-squared-errors (MSE) loss function.

14. The device according to claim 11, wherein the latent space cluster is a K-means latent space cluster.

15. The device according to claim 11, wherein the physiological signal is collected by a stethoscope.

16. The device according to claim 11, wherein the clustered sound signals originate from different sources at an auscultation site.

17. The device according to claim 11, wherein the clustered sound signals originate from different sources at different auscultation sites.

18. The device according to claim 11, wherein the physiological signal comprises a heart sound and a lung sound.

19. The device according to claim 18, wherein the plurality of clustered coded matrices comprises a heart sound clustered coded matrix corresponding to the heart sound and a lung sound clustered coded matrix corresponding to the lung sound.

20. The device according to claim 11, wherein the clustered sound signals comprise a clustered heart sound and a clustered lung sound.

21. A method for sound separation, comprising steps of: receiving a physiological signal by a sound collector; transforming the physiological signal received by the sound collector into a spectrum by a transforming unit; generating a coded matrix by an encoder, from the spectrum transformed by the transforming unit; generating a periodicity coded matrix by a Fourier transformer, according to the coded matrix generated by the encoder; grouping a plurality of clustered coded matrices by a latent space cluster, according to the coded matrix generated by the encoder and the periodicity coded matrix generated by the Fourier transformer; generating a plurality of clustered spectrums by a decoder, from the plurality of the clustered coded matrices grouped by the latent space cluster; and reconstructing a plurality of clustered sound signals by an inverse Fourier transformer, from the plurality of the clustered spectrums generated by the decoder, and each of the clustered sound signals is reconstructed from one of the clustered spectrums.

22. The method according to claim 21, wherein the physiological signal is collected by a stethoscope.

23. The method according to claim 21, wherein the clustered sound signals originate from different sources at an auscultation site.

24. The method according to claim 21, wherein the clustered sound signals originate from different sources at different auscultation sites.

25. The method according to claim 21, wherein the physiological signal comprises a heart sound and a lung sound.

26. The method according to claim 25, wherein the plurality of clustered coded matrices comprises a heart sound clustered coded matrix corresponding to the heart sound and a lung sound clustered coded matrix corresponding to the lung sound.

27. The method according to claim 21, wherein the clustered sound signals comprise a clustered heart sound and a clustered lung sound.

28. A method for sound separation, comprising steps of: generating a coded matrix by an encoder, from a spectrum of a physiological signal; generating a periodicity coded matrix by a Fourier transformer, according to the coded matrix generated by the encoder; grouping a plurality of clustered coded matrices by a latent space cluster, according to the coded matrix generated by the encoder and the periodicity coded matrix generated by the Fourier transformer; generating a plurality of clustered spectrums by a decoder, from the plurality of the clustered coded matrices grouped by the latent space cluster; and reconstructing a plurality of clustered sound signals by an inverse Fourier transformer, from the plurality of the clustered spectrums generated by the decoder, and each of the clustered sound signals is reconstructed from one of the clustered spectrums.

29. The method according to claim 28, wherein the physiological signal is collected by a stethoscope.

30. The method according to claim 28, wherein the clustered sound signals originate from different sources at an auscultation site.

31. The method according to claim 28, wherein the clustered sound signals originate from different sources at different auscultation sites.

32. The method according to claim 28, wherein the physiological signal comprises a heart sound and a lung sound.

33. The method according to claim 32, wherein the plurality of clustered coded matrices comprises a heart sound clustered coded matrix corresponding to the heart sound and a lung sound clustered coded matrix corresponding to the lung sound.

34. The method according to claim 28, wherein the clustered sound signals comprise a clustered heart sound and a clustered lung sound.

Description:
METHOD AND SYSTEM FOR SEPARATION OF SOUNDS FROM DIFFERENT SOURCES

FIELD

[0001] The present disclosure is generally related to a method, a module, and a system for separation of sounds from different sources. More particularly, the present disclosure is directed to a method, a module, and a system for analysis of heart sounds and lung sounds.

BACKGROUND

[0002] Auscultation is an important tool for analyzing and monitoring heart, lung, bowel, and vascular disorders in a human body. A stethoscope is often used by a physician to perform auscultation.

[0003] The physician may use chest auscultation to analyze, monitor, or diagnose various disorders in the cardiovascular system and respiratory system. The lung, the heart, and the thoracic aorta are located in the thoracic cavity. The lung sound is generated from exhale and inhale, and the heart sound is generated from diastole and systole in a cardiac cycle. Due to the proximity of lung and heart, auscultation sound from the chest auscultation is often a mixture of lung sound and heart sound.

[0004] Because of the proximity of their sources, the heart sound and the lung sound may be noise or interference to each other in chest auscultation. When performing auscultation of the heart, the presence of the lung sound is noise to the physician; when performing auscultation of the lung, the presence of the heart sound is likewise noise. These noises or interferences can be amplified by an earpiece of the stethoscope when performing auscultation.

[0005] Besides the interference between the lung sound and the heart sound, auscultation of body parts other than the chest can also be hindered by a source in the human body that is not the subject of the auscultation. For example, auscultation of the fetal heart can be interfered with by the heart sound or the lung sound of the mother, and auscultation of the mother's heart can be interfered with by the heart sound of the fetus.

[0006] To manage noises or interferences in auscultation, physicians are trained to deliberately identify the sources of the interferences. However, the complexity and nature of the interferences can be overwhelming, so misinterpretation or missed detection of auscultation sounds may occur.

[0007] Improved stethoscope technology is another approach to managing interference in auscultation. Digital stethoscopes that include analog-to-digital signal converters or computers may reduce or filter the interference. US20180317876A1 discloses a digital stethoscope that includes a noise reduction system for removing the heart sound in lung auscultation. However, both heart sounds and lung sounds contain important information for the diagnosis, analysis, or monitoring of the heart and the lung. If the heart sound is removed by a noise reduction system and a heart auscultation is later required for the same patient, the physician would need to conduct another chest auscultation to recapture the previously discarded heart sound. This would be a burden for both the physician and the patient.

[0008] US20180125444A1 discloses an auscultation device for detecting a cardiac and/or respiratory disease, in which accelerometer signals are used to monitor the inhalation-exhalation cycle of the lung to identify the lung sound. However, the additional accelerometers increase the weight of the device, making it difficult for the physician to carry.

[0009] To manage the interferences in auscultation, there is a need for separating sounds from different sources in auscultation. The interference can be the lung sound or the heart sound in chest auscultation, the heart sound or the lung sound of the mother in fetal heart auscultation, the heart sound of the fetus in heart auscultation of the mother, or any other interference from a source that is not the subject of the auscultation.

[0010] There is also a need for an improved digital auscultation system for separating sounds from different sources and reconstructing said sounds from different sources. The reconstructed sounds are therefore rendered as valuable information in auscultation.

SUMMARY OF THE INVENTION

[0011] It is an object of the present disclosure to provide machine learning methods and systems for separating sounds from different sources in auscultation.

[0012] It is also an object of the present disclosure to provide methods and systems based on non-supervised learning for separating sounds from different sources in auscultation.

[0013] It is also an object of the present disclosure to provide methods and systems based on an autoencoder for separating sounds with different periodicities.

[0014] An embodiment of the present disclosure provides a system for sound separation. The system comprises a sound collector for collecting a physiological signal and a sound separation device. The sound separation device comprises: a transformation unit, configured to receive the physiological signal from the sound collector, for transforming the physiological signal into a spectrum; an encoder, configured to receive the spectrum from the transformation unit, for generating a coded matrix from the spectrum; a Fourier transformer, configured to receive the coded matrix from the encoder, for generating a periodicity coded matrix according to the coded matrix; a latent space cluster, configured to receive the coded matrix from the encoder and the periodicity coded matrix from the Fourier transformer, for grouping a plurality of clustered coded matrices according to the coded matrix and the periodicity coded matrix; a decoder, configured to receive the clustered coded matrices from the latent space cluster, for generating a plurality of clustered spectrums from the plurality of the clustered coded matrices; and an inverse Fourier transformer, configured to receive the clustered spectrums from the decoder, for reconstructing a plurality of clustered sound signals from the plurality of the clustered spectrums.

[0015] In a preferred embodiment, the encoder and the decoder are configured to form a deep autoencoder.

[0016] In a preferred embodiment, the deep autoencoder is implemented with a minimizing-mean-squared-errors (MSE) loss function.

[0017] In a preferred embodiment, the latent space cluster is a K-means latent space cluster.

[0018] In a preferred embodiment, the sound collector is a stethoscope.

[0019] In a preferred embodiment, the clustered sound signals are originated from different sources at an auscultation site.

[0020] In a preferred embodiment, the clustered sound signals are originated from different sources at different auscultation sites.

[0021] In a preferred embodiment, the physiological signal comprises a heart sound and a lung sound.

[0022] In a preferred embodiment, the plurality of clustered coded matrices comprises a heart sound clustered coded matrix corresponding to the heart sound and a lung sound clustered coded matrix corresponding to the lung sound.

[0023] In a preferred embodiment, the clustered sound signals comprise a clustered heart sound and a clustered lung sound.

[0024] In a preferred embodiment, each of the clustered sound signals is reconstructed from one of the clustered spectrums.

[0025] An embodiment of the present disclosure provides a sound separation device. The sound separation device comprises: a transformation unit for transforming a physiological signal into a spectrum; an encoder, configured to receive the spectrum from the transformation unit, for generating a coded matrix from the spectrum; a Fourier transformer, configured to receive the coded matrix from the encoder, for generating a periodicity coded matrix according to the coded matrix; a latent space cluster, configured to receive the coded matrix from the encoder and the periodicity coded matrix from the Fourier transformer, for grouping a plurality of clustered coded matrices according to the coded matrix and the periodicity coded matrix; a decoder, configured to receive the clustered coded matrices from the latent space cluster, for generating a plurality of clustered spectrums from the plurality of the clustered coded matrices; and an inverse Fourier transformer, configured to receive the clustered spectrums from the decoder, for reconstructing a plurality of clustered sound signals from the plurality of the clustered spectrums.

[0026] An embodiment of the present disclosure provides a method for sound separation. The method comprises steps of: receiving a physiological signal by a sound collector; transforming the physiological signal received by the sound collector into a spectrum by a transforming unit; generating a coded matrix by an encoder, from the spectrum transformed by the transforming unit; generating a periodicity coded matrix by a Fourier transformer, according to the coded matrix generated by the encoder; grouping a plurality of clustered coded matrices by a latent space cluster, according to the coded matrix generated by the encoder and the periodicity coded matrix generated by the Fourier transformer; generating a plurality of clustered spectrums by a decoder, from the plurality of the clustered coded matrices grouped by the latent space cluster; and reconstructing a plurality of clustered sound signals by an inverse Fourier transformer, from a plurality of the clustered spectrums generated by the decoder.

[0027] In a preferred embodiment, the physiological signal is collected by a stethoscope.

[0028] An embodiment of the present disclosure provides another method for sound separation. The method comprises steps of: generating a coded matrix by an encoder, from a spectrum of a physiological signal; generating a periodicity coded matrix by a Fourier transformer, according to the coded matrix generated by the encoder; grouping a plurality of clustered coded matrices by a latent space cluster, according to the coded matrix generated by the encoder and the periodicity coded matrix generated by the Fourier transformer; generating a plurality of clustered spectrums by a decoder, from the plurality of the clustered coded matrices grouped by the latent space cluster; and reconstructing a plurality of clustered sound signals by an inverse Fourier transformer, from a plurality of the clustered spectrums generated by the decoder.

BRIEF DESCRIPTION OF THE DRAWINGS

[0029] Implementations of the present technology will now be described, by way of examples only, with reference to the attached figures.

[0030] FIG. 1 is a schematic illustration of a system for sound separation, in accordance with an embodiment of the present disclosure.

[0031] FIG. 2 is schematic diagram of a sound collector and a sound separation device, in accordance with an embodiment of the present disclosure.

[0032] FIG. 3 is an architecture of a deep autoencoder, in accordance with an embodiment of the present disclosure.

[0033] FIG. 4 is a schematic diagram of a sound separation process within the sound separation device, in accordance with an embodiment of the present disclosure.

[0034] FIG. 5 is a schematic diagram of some of the steps in FIG. 4, in accordance with an embodiment of the present disclosure.

[0035] FIG. 6A is a schematic representation of a periodicity coded matrix, and FIGS. 6B and 6C are schematic representations of coded matrices, in accordance with an embodiment of the present disclosure.

[0036] FIG. 7A is a representation of a clustered spectrum, and FIG. 7B is a representation of another clustered spectrum, in accordance with an embodiment of the present disclosure.

[0037] FIG. 8A is a representation of a clustered sound signal, and FIG. 8B is a representation of another clustered sound signal, in accordance with an embodiment of the present disclosure.

[0038] FIG. 9 is a schematic illustration of a system for sound separation, in accordance with an embodiment of the present disclosure.

[0039] FIG. 10 is a schematic illustration of a system for sound separation, in accordance with an embodiment of the present disclosure.

[0040] FIG. 11 is a schematic illustration of a system for sound separation, in accordance with an embodiment of the present disclosure.

DETAILED DESCRIPTION

[0041] It will be noted at the beginning that for simplicity and clarity of illustration, where appropriate, reference numerals have been repeated among the different figures to indicate corresponding or analogous elements. In addition, numerous specific details are set forth in order to provide a thorough understanding of the embodiments described herein. However, it will be understood by those of ordinary skill in the art that the embodiments described herein can be practiced without these specific details. In other instances, methods, procedures and components have not been described in detail so as not to obscure the related relevant feature being described. The drawings are not necessarily to scale and the proportions of certain parts may be exaggerated to better illustrate details and features. The description is not to be considered as limiting the scope of the embodiments described herein.

[0042] Several definitions that apply throughout this disclosure will now be presented.

[0043] The term “coupled” is defined as connected, whether directly or indirectly through intervening components, and is not necessarily limited to physical connections. The connection can be such that the objects are permanently connected or releasably connected. The term “comprising” means “including, but not necessarily limited to”; it specifically indicates open-ended inclusion or membership in the so-described combination, group, series, and the like.

[0044] FIG. 1 is a schematic illustration of a system in accordance with an embodiment of the present disclosure. A system 10 comprises a sound collector 11 and a sound separation device 12. The sound collector 11 collects sound signals, or physiological sounds in a human body, and can be a conventional stethoscope or a digital stethoscope. The sound collector 11 can be placed directly on the chest, the abdomen, or other auscultation sites to perform an auscultation. When the sound collector 11 is placed on the chest, the purposes of the auscultation can be analyzing, monitoring, or diagnosing various disorders in the cardiovascular system or the respiratory system. When the sound collector 11 is placed on the abdomen of a pregnant woman, the purposes of the auscultation can be analyzing or monitoring the cardiovascular system of a fetus. The sound collector 11 transmits the sound signals to the sound separation device 12 by wireless communication or cable.

[0045] The sound separation device 12 can receive the sound signals from the sound collector 11. Each of the sound signals from the sound collector 11 may comprise at least two sound signals originating from different sources. The sources of the sound signals can be a lung, a heart, or a thoracic aorta when the sound collector 11 is placed on the chest. The sources of the sound signals can be the heart of the fetus, the heart of the mother, or the lung of the mother when the sound collector 11 is placed on the abdomen of the pregnant woman.

[0046] FIG. 2 is a schematic diagram of the sound collector 11 and the sound separation device 12, in accordance with an embodiment of the present disclosure. The sound separation device 12 may comprise at least one processor 127 to analyze or separate the sound signals from different sources. The processor can be a CPU, an MPU, or another component that performs computation. The sound separation device 12 can be a mobile device, a personal computer, or a server.

[0047] The processor of the sound separation device 12 comprises or is coupled to a computer-readable medium. A non-transitory computer program product is embodied in the computer-readable medium and can be executed by the processor 127 of the sound separation device 12 to separate the sound signals from the different sources.

[0048] To perform sound separation, the sound separation device 12 comprises a transformation unit 121, an encoder 122, a Fourier transformer 123, a latent space cluster 124, a decoder 125, and an inverse Fourier transformer 126. Each of the above elements can be constituted as a part of the processor, or can be embedded into the non-transitory computer program product. The transformation unit 121 of the sound separation device 12 receives the sound signals from the sound collector 11 and transforms the signals into spectrums, which are received by the encoder 122. The mechanism and process required for the sound separation are described below.
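The dataflow among these components can be sketched as a minimal pipeline. Every stage below is an illustrative placeholder (the encoder and decoder here are not trained networks, and the class and method names are this sketch's own), intended only to make the component wiring concrete:

```python
import numpy as np

class SoundSeparationDevice:
    """Minimal wiring of the components in paragraph [0048].
    Each stage is a simplified stand-in so the dataflow can be followed;
    the real device would use a trained encoder/decoder."""

    def transformation_unit(self, signal, frame=8):
        # Transform the time signal into a (freq-bin x time-frame) spectrum.
        frames = signal.reshape(-1, frame)            # non-overlapping frames
        return np.abs(np.fft.rfft(frames, axis=1)).T  # magnitude spectrum

    def encoder(self, spectrum):
        return spectrum                # placeholder: latent code = spectrum

    def fourier_transformer(self, coded):
        # Periodicity of each latent dimension across time frames.
        return np.abs(np.fft.rfft(coded, axis=1))

    def latent_space_cluster(self, coded, periodicity):
        # Placeholder grouping: split latent dimensions by dominant period.
        dominant = periodicity[:, 1:].argmax(axis=1)
        median = np.median(dominant)
        masks = [dominant <= median, dominant > median]
        return [np.where(m[:, None], coded, 0.0) for m in masks]

    def decoder(self, clustered_coded):
        return [c for c in clustered_coded]  # placeholder inverse of encoder

    def inverse_fourier_transformer(self, clustered_spectrums, frame=8):
        return [np.fft.irfft(s.T, n=frame, axis=1).ravel()
                for s in clustered_spectrums]

    def run(self, signal):
        spec = self.transformation_unit(signal)
        coded = self.encoder(spec)
        periodicity = self.fourier_transformer(coded)
        clusters = self.latent_space_cluster(coded, periodicity)
        spectrums = self.decoder(clusters)
        return self.inverse_fourier_transformer(spectrums)

out = SoundSeparationDevice().run(np.sin(np.linspace(0.0, 20.0, 64)))
```

Note that the magnitude-only spectrum discards phase, so the final stage here is not a faithful inverse; it merely preserves the shape of the round trip from one signal to a list of per-source signals.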

1. The Mechanism of a Deep Autoencoder

[0049] The encoder 122 and the decoder 125 of the sound separation device 12 are configured to form a deep autoencoder (DAE). The encoder 122 encodes an input x_n into a latent space, and the decoder 125 then attempts to reconstruct the input by decoding from the latent space. The reconstructed output y_n is aimed to approximate x_n by minimizing the mean squared error (MSE).

[0050] The following equations illustrate the DAE mechanism:

l_n = σ(w_E x_n + b_E)

y_n = σ(w_D l_n + b_D)

[0051] According to the equations provided above, w_E and w_D are the encoding and decoding matrices, b_E and b_D are vectors of biases, σ(·) is an activation function, and l_n ∈ R^(M×1), wherein M is the number of neurons in the latent space.
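Assuming the conventional single-layer autoencoder formulation (the activation function, dimensions, and random weights below are illustrative assumptions, not values from the patent), the encode/decode/MSE computations can be sketched in numpy as:

```python
import numpy as np

rng = np.random.default_rng(0)

# Dimensions: input x_n in R^D, latent l_n in R^M (M neurons in latent space).
D, M, N = 16, 4, 100
w_E, b_E = rng.standard_normal((M, D)) * 0.1, np.zeros(M)  # encoder weights
w_D, b_D = rng.standard_normal((D, M)) * 0.1, np.zeros(D)  # decoder weights

sigma = np.tanh                       # an example activation function

def encode(x):
    return sigma(w_E @ x + b_E)       # l_n = sigma(w_E x_n + b_E)

def decode(l):
    return sigma(w_D @ l + b_D)       # y_n = sigma(w_D l_n + b_D)

X = rng.standard_normal((N, D))
Y = np.array([decode(encode(x)) for x in X])

# Training would minimize the mean squared error between y_n and x_n:
mse = np.mean(np.sum((Y - X) ** 2, axis=1))
```

In a trained DAE, back-propagation would adjust w_E, w_D, b_E, and b_D to drive this MSE down; here the weights are random, so the loss is merely computed, not minimized.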

[0052] The DAE can be constructed with different architectures, including fully connected layers, convolutional layers, and de-convolutional layers. An objective of the autoencoder is to learn a representation (encoding) for a set of data. An embodiment of the present disclosure analyzes the periodicity of the latent representation, then classifies and groups the latent representation corresponding to the original set of data. Therefore, different DAE architectures, which have different expressiveness for modeling sound signals, could be implemented by various embodiments of the present disclosure for different applications. In one of the embodiments of the present disclosure, a deep convolutional autoencoder (DCAE) can be adopted as the DAE for its superiority in dismantling and restoring signals. FIG. 3 is an architecture of the DAE, in accordance with an embodiment of the present disclosure. In FIG. 3, the DCAE is constructed with several convolutional layers and deconvolutional layers. The convolutional layers extract features of the input, and the deconvolutional layers reconstruct approximate sound signals from the features represented by the convolutional layers. The convolutional layers connect multiple input activations within a filter to form a single activation. In FIG. 3, layer 128a can be a Conv2D(4x1) 32-unit layer, layer 128b can be a Conv2D(4x1) 16-unit layer, layer 128c can be a Conv2D(3x1) 8-unit layer, and layer 128d can be the latent space. The deconvolutional layer is the converse of the convolutional layer: it associates a single input activation with multiple outputs. Layer 128e can be a DeConv2D(3x1) 8-unit layer, layer 128f can be a DeConv2D(3x1) 16-unit layer, and layer 128g can be a DeConv2D(4x1) 32-unit layer. The types of autoencoder, the structure of the layers, the number of units in each layer, the number of neurons, and other details in the configuration of FIG. 3 are illustrative only. One of ordinary skill in the art would understand that the autoencoder configuration can be varied without degrading signal separation, classification, or purification, while preventing under-fitting or over-fitting.

[0053] The following equations illustrate the convolutional layer and the deconvolutional layer:

w * x = W x'

w *^T x = W^T x'

[0054] According to the equations provided above, w is defined as a kernel and x is defined as the input; x' denotes the input arranged as a vector, W is the matrix form of the kernel w, and W^T implements the transposed (deconvolutional) operation.
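The identity between (de-)convolution and multiplication by a (transposed) structured matrix can be checked numerically. The example below is a generic 1-D "valid" correlation (the usual deep-learning convention for "convolution"), not code from the patent:

```python
import numpy as np

# A "valid" 1-D convolution (correlation) of kernel w with input x can be
# written as a matrix product W @ x, where each row of W is a shifted copy
# of w; the transposed (de-)convolution is then simply W.T @ y.
w = np.array([1.0, 2.0, 3.0])              # kernel
x = np.array([4.0, 5.0, 6.0, 7.0, 8.0])   # input
n_out = len(x) - len(w) + 1               # 3 output activations

W = np.zeros((n_out, len(x)))
for i in range(n_out):
    W[i, i:i + len(w)] = w                # shifted copies of the kernel

conv = W @ x                   # w * x as a matrix product
direct = np.array([w @ x[i:i + len(w)] for i in range(n_out)])
assert np.allclose(conv, direct)

deconv = W.T @ conv            # transposed convolution: one activation
                               # spreads back to multiple outputs
```

The matrix product maps 5 inputs to 3 activations; the transpose maps those 3 activations back to 5 outputs, which is exactly the many-to-one versus one-to-many behavior described for the convolutional and deconvolutional layers.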

2. The Process of the Sound Separation

[0055] FIG. 4 is a schematic diagram of a sound separation process within the sound separation device 12, in accordance with an embodiment of the present disclosure. The proposed periodicity-coded deep auto-encoder (PC-DAE) in the sound separation device 12 is able to solve blind source separation (BSS) problems. The BSS problem is the separation of signals from different sources within a mixed signal, without any prior knowledge of the sources or of how the signals are mixed. In the present disclosure, the BSS problem can be the separation of a heart sound and a lung sound from the sound signal of a chest auscultation, because the lung sound and the heart sound are collected at one auscultation site and overlap in a frequency range of 50 Hz to 150 Hz. The BSS problem of the present disclosure can also be the separation of a fetal heart sound and a maternal heart sound or lung sound from the sound signal of a fetal auscultation.

[0056] The physiological signal is a mixture of at least two sound signals, each originating from a different source. The physiological signal can be in a digital format, and the digital format may be converted from an analog sound signal. In one embodiment, each of the physiological signals is collected from the auscultation site and originates from at least two sources. The source of the physiological signals can be a lung or a heart when the physiological signals are collected by the sound collector 11 from the chest; the physiological signal from this auscultation site therefore comprises at least the lung sound and the heart sound. In one embodiment, the source of the physiological signals can also be the heart of the fetus, the heart of the mother, or the lung of the mother when the physiological signals are collected by the sound collector 11 from the abdomen of the pregnant woman; the physiological signal from this auscultation site therefore comprises at least the fetal heart sound and the maternal heart or lung sound. In one embodiment, the physiological signals may also be stored in a database, wherein the location of sound collection is already identified within each of the physiological signals.

[0057] Additionally, the physiological signals may be obtained separately from different auscultation sites. Various auscultation site localizing methods can be implemented with an embodiment of the present disclosure, such as the auscultation site locating methods described in U.S. Patent Publication No. US20160143512A1.

[0058] In Step S1 of FIG. 4, the sound separation device 12 receives the physiological signals. The physiological signals can come from the sound collector 11 or the database. In one embodiment of the present disclosure, the physiological signal includes 2 sound signals originating from 2 sources, namely a 1st sound signal originating from a 1st source and a 2nd sound signal originating from a 2nd source. In another embodiment of the present disclosure, the physiological signal may include more than 2 sound signals originating from more than 2 sources.

[0059] In Step S2 of FIG. 4, the transformation unit 121 of the sound separation device 12 transforms the physiological signals into a spectrum by Short Time Fourier Transformation (STFT), Fast Fourier Transformation (FFT), or Discrete Fourier Transformation (DFT). The spectrum can be logarithmized into a log power spectrum (LPS). In an embodiment of the present disclosure, the LPS is modeled by the DAE, and the DAE minimizes the mean square error reconstruction loss by using back-propagation algorithms. In an embodiment of the present disclosure, the transformation unit 121 is in communication with the sound collector 11 and configured to receive the sound signals collected by the sound collector 11.
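The transform in Step S2 can be sketched as follows. This is a minimal illustration, not the device's implementation: the 2048-sample frame and 128-sample shift are taken from the experimental setup described later, while the Hann window and the small logarithm floor are assumptions of this sketch.

```python
import numpy as np

def log_power_spectrum(signal, frame_len=2048, hop=128):
    """Compute a log power spectrum (LPS) via a simple STFT."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping windowed frames
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    spec = np.fft.rfft(frames, axis=1)   # one-sided spectrum per frame
    power = np.abs(spec) ** 2
    return np.log(power + 1e-12)         # small floor avoids log(0)

# Example: a 100 Hz tone sampled at 8 kHz, as in the described setup.
fs = 8000
t = np.arange(fs) / fs
lps = log_power_spectrum(np.sin(2 * np.pi * 100 * t))
print(lps.shape)  # (n_frames, frame_len // 2 + 1)
```

Each row of the resulting matrix is one time frame of the LPS, which is the representation fed to the DAE.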

[0060] In Step S3 of FIG. 4, a coded matrix is generated from the spectrum of Step S2 by the encoder 122 in the sound separation device 12. The coded matrix exhibits significant expressiveness of temporal information. In an embodiment of the present disclosure, the encoder 122 is in communication with the transformation unit 121 and configured to receive the spectrum generated in Step S2 by the transformation unit 121. The encoder 122 is configured to correspond to the decoder 125 according to the general autoencoder structure described in FIG. 3.

[0061] In Step S4 of FIG. 4, the coded matrix generated in Step S3 is further transformed into a periodicity coded matrix via DFT for analyzing the temporal information. In an embodiment of the present disclosure, the transformation into the periodicity coded matrix is performed by the Fourier transformer 123; the Fourier transformer 123 is in communication with the encoder 122 and configured to receive the coded matrix generated in Step S3 by the encoder 122.

[0062] In Step S5 of FIG. 4, a latent space cluster 124 is used for grouping a plurality of clustered coded matrices. In an embodiment of the present disclosure, the latent space cluster 124 is in communication with the encoder 122 and the Fourier transformer 123, and is configured to receive the coded matrix generated in Step S3 by the encoder 122 and the periodicity coded matrix generated in Step S4 by the Fourier transformer 123. The technical principles of the signal processing in Step S4 and Step S5 are described below. In the embodiment wherein the physiological signal includes 2 different sound signals, different neurons of the periodicity analysis algorithm are activated by the 1st sound signal or the 2nd sound signal in the latent space. Therefore, after the DAE models a mixture of the 1st sound signal and the 2nd sound signal, the log power spectrum sequence is inputted to the encoder 122 to obtain latent representations at each time step. The latent representations (latent space sequences) are then concatenated into a coded matrix L, where L ∈ R^(M×N). The coded matrix L is represented as Z^mix comprising the latent representations, as described below:

[0063] Z^mix = [z_1^mix, …, z_M^mix]^T. According to the equation provided above, each coded matrix element z_j^mix is a row of L and serves as a time factor: z_j^mix = [l_j1, …, l_jN], 1 ≤ j ≤ M, z_j^mix ∈ R^(1×N). M is the number of neurons in the latent space, and N is the duration.

[0064] In Step S4 of FIG. 4, the periodicity coded matrix is generated by the Fourier transformer 123 in the sound separation device 12 for further temporal information analysis. When sound signals having different periodicity characteristics are mixed, the sound signals can be separated from each other by identifying the different periodicity characteristics; thus blind source separation (BSS) problems are solved. In one embodiment, a sound collector may collect a physiological signal with a lung sound, a heart sound, thermal noise, and other noises generated during the chest auscultation. Because the lung sound and the heart sound exhibit different periodicity characteristics, they can be identified in Step S5 of FIG. 4 by clustering methods. Therefore, to facilitate the periodicity analysis, the periodicity coded matrix is generated by the Fourier transformer 123 via Discrete Fourier Transformation (DFT).

[0065] Numerically speaking, the transformation of the coded matrix Z^mix into the periodicity coded matrix P = [p_1, …, p_j, …] by the Fourier transformer 123 via DFT is described as: p_j = |DFT(z_j^mix)|

[0066] According to the equation provided above, 1 ≤ j ≤ M. The frame size of the DFT is set equal to N.
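Step S4 therefore reduces to taking the DFT magnitude of each row of the coded matrix, with the DFT frame size equal to the duration N. A minimal sketch:

```python
import numpy as np

def periodicity_matrix(Z_mix):
    """Build the periodicity coded matrix P from a coded matrix Z_mix.

    Each row z_j (one latent neuron's activation over N time steps) is
    transformed by a DFT whose frame size equals N, and the magnitude
    is kept: p_j = |DFT(z_j)|.
    """
    # rfft over the time axis gives the one-sided magnitude spectrum per row
    return np.abs(np.fft.rfft(Z_mix, axis=1))

# A latent row oscillating at 5 cycles over N steps peaks at DFT bin 5,
# while a 20-cycle row peaks at bin 20: distinct periodicities separate.
N = 256
Z = np.vstack([np.cos(2 * np.pi * 5 * np.arange(N) / N),
               np.cos(2 * np.pi * 20 * np.arange(N) / N)])
P = periodicity_matrix(Z)
print(P.shape)                                  # (2, N // 2 + 1)
print(int(P[0].argmax()), int(P[1].argmax()))   # 5 20
```

Rows driven by the heart sound and rows driven by the lung sound produce magnitude spectra peaking at different bins, which is what the subsequent clustering exploits.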

[0067] In Step S5 of FIG. 4, the latent space cluster 124 is used for grouping the plurality of clustered coded matrices. In one embodiment, the latent space cluster 124 is implemented with a sparse NMF clustering method for analyzing the periodicity coded matrix. The sparse NMF clustering method can be described as:

H_p = argmin_(W_p, H_p) ( ||P^T − W_p H_p||_2^2 + λ ||H_p||_1 )

[0068] According to the equation provided above, an approximation of P^T by the NMF is achieved by minimizing the error function. W_p provides the cluster centroids, and H_p = [h_1, …, h_j, …] provides the cluster membership, wherein h_j is the j-th column of H_p, k is set to the number of cluster bases, and 1 ≤ j ≤ M. The cluster to which p_j is grouped is determined by the index of the largest element of h_j; λ represents a sparsity penalty factor; ||·||_1 represents the L1-norm; ||·||_2 represents a distance measurement.
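Sparse NMF of this kind, minimizing a squared reconstruction error plus an L1 penalty on H, can be sketched with plain multiplicative updates. The update rules, the penalty weight, and the iteration count below are assumptions of this sketch, not the disclosure's implementation:

```python
import numpy as np

def sparse_nmf(V, k, lam=0.01, n_iter=200, seed=0):
    """Sparse NMF by multiplicative updates:
    minimize ||V - W H||_2^2 + lam * ||H||_1 over W, H >= 0.

    V plays the role of P^T (frequency-by-neuron); each column h_j of H
    gives a cluster membership, and neuron j is assigned to argmax(h_j).
    """
    rng = np.random.default_rng(seed)
    F, M = V.shape
    W = rng.random((F, k)) + 1e-3
    H = rng.random((k, M)) + 1e-3
    eps = 1e-9
    for _ in range(n_iter):
        # L1 penalty lam enters the denominator of the H update
        H *= (W.T @ V) / (W.T @ W @ H + lam + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Two groups of columns with clearly different spectral shapes should
# land in different clusters.
V = np.zeros((8, 6))
V[1, :3] = 1.0   # first three columns peak at row 1
V[5, 3:] = 1.0   # last three columns peak at row 5
W, H = sparse_nmf(V, k=2)
labels = H.argmax(axis=0)
print(labels[:3], labels[3:])  # the two halves get two distinct labels
```

In the disclosed pipeline the labels would then be mapped back to the rows of the coded matrix, grouping latent neurons by periodicity.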

[0069] After the sparse NMF clustering, the clustering result of the periodicity matrix P is assigned back to the coded matrix. The coded matrices that share similar periodicity are assigned as clustered coded matrices.

[0070] The periodicity analysis algorithm used in Step S5 can be described as follows:

Periodicity Analysis Algorithm: Periodicity-coded analysis for coded matrix separation

Input: the coded matrix Z^mix
Output: 1st coded matrix Z^1st, 2nd coded matrix Z^2nd

set M to the column dimension of Z^mix
foreach j in 1 … M do
    set p_j to |DFT(z_j^mix)|
end
Cluster P by sparse NMF, and get the labels H_p corresponding to Z^mix
foreach source in {1st, 2nd} do
    set Z^source to Z^mix
    foreach row j of Z^source not labeled as source do
        set z_j^source to min_element(z_j^source)
    end
    return Z^source
end
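The suppression step of the algorithm above, keeping the rows of Z^mix that belong to one source and flattening the others to a minimum value, can be sketched as follows (reading min_element per-row is an assumption of this sketch; it could also be read as a global minimum):

```python
import numpy as np

def split_coded_matrix(Z_mix, labels, source):
    """Keep the rows of Z_mix whose cluster label equals `source`;
    rows belonging to other sources are flattened to their own minimum
    value, suppressing them before decoding.
    """
    Z_src = Z_mix.copy()
    mask = labels != source
    # Replace every suppressed row by its own minimum element
    Z_src[mask] = Z_src[mask].min(axis=1, keepdims=True)
    return Z_src

Z_mix = np.array([[0.2, 0.9, 0.2],
                  [0.8, 0.1, 0.8]])
labels = np.array([0, 1])   # row 0 -> 1st source, row 1 -> 2nd source
Z_1st = split_coded_matrix(Z_mix, labels, 0)
Z_2nd = split_coded_matrix(Z_mix, labels, 1)
print(Z_1st)   # row 1 flattened to 0.1
print(Z_2nd)   # row 0 flattened to 0.2
```

Each per-source coded matrix is then passed through the decoder to obtain that source's spectrum.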

[0071] The clustering algorithm, the structure of the matrix factorization, and other details described above are illustrative only. One of ordinary skill in the art would understand that other clustering methods could be implemented in an embodiment of the present disclosure to balance performance, entropy, the number of clusters, iterations, or processing time. The clustering method may include K-means clustering, NMF clustering, or pre-trained supervised clustering.

[0072] In an embodiment of the present disclosure, in order to obtain better intelligibility of each of the sound signals, a mask is constructed for masking the interferences of each source. In other words, one embodiment of the present disclosure generates specific clustered spectrums while masking other sounds. These "other sounds" are regarded as interferences in view of the desired sound sources corresponding to the specific clustered spectrums. Therefore, an embodiment of the present disclosure could generate clustered spectrums of lung sounds while masking heart sounds and other noises, and another embodiment could generate clustered spectrums of heart sounds while masking lung sounds and other noises.

[0073] The mask is defined as the following:

M_k(λ, f) = 1 if source_k(λ, f) ≥ source_k'(λ, f) for all 1 ≤ k' ≤ K; otherwise M_k(λ, f) = 0

[0074] According to the above, in the embodiment wherein the spectrum is LPS, source_k(λ, f) denotes each source of the LPS, in particular the λ-th time frame and the f-th frequency unit, and 1 ≤ k ≤ K, wherein K is the amount of the clusters.

[0075] In Step S6 of FIG. 4, the clustered coded matrices are decoded by the decoder 125 and constructed into a plurality of clustered spectrums. In an embodiment of the present disclosure, the decoder 125 is in communication with the latent space cluster 124 and configured to receive the clustered coded matrices grouped in Step S5 by the latent space cluster 124. In the embodiment wherein the physiological signal includes 2 sound signals, two clusters are presented. Therefore, after the clustered coded matrices are grouped by the periodicity analysis algorithm, the separated sources are decoded as in the equations below. A first mask M^1st(λ, f) and a second mask M^2nd(λ, f) are constructed after the clustered coded matrices and their corresponding spectrums/LPS are obtained.

[0076] The first LPS Y^1st and the second LPS Y^2nd are generated from the mixed LPS by applying the masks for the specific sounds.

Y^1st = M^1st(λ, f) ⊙ Y, Y^2nd = M^2nd(λ, f) ⊙ Y, wherein Y is the mixed LPS and ⊙ denotes element-wise multiplication.
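The masking can be sketched as below, assuming a binary dominant-source mask in which each time-frequency unit is assigned entirely to the strongest decoded source; this mask formula is a reading of the disclosure, not a quoted definition:

```python
import numpy as np

def binary_masks(lps_sources):
    """Construct one binary mask per source from its decoded LPS: each
    time-frequency unit is assigned to whichever source is strongest
    there (a dominant-source mask, assumed for this sketch)."""
    stacked = np.stack(lps_sources)         # (K, frames, bins)
    winner = stacked.argmax(axis=0)         # index of the loudest source
    return [(winner == k).astype(float) for k in range(len(lps_sources))]

lps_1st = np.array([[5.0, 1.0], [4.0, 0.0]])   # decoded LPS, source 1
lps_2nd = np.array([[2.0, 3.0], [1.0, 6.0]])   # decoded LPS, source 2
m1, m2 = binary_masks([lps_1st, lps_2nd])
Y_mix = np.array([[5.5, 3.2], [4.1, 6.3]])     # mixed LPS
Y_1st = m1 * Y_mix                             # Y^1st = M^1st ⊙ Y
Y_2nd = m2 * Y_mix                             # Y^2nd = M^2nd ⊙ Y
print(m1)   # units where source 1 dominates
print(m2)   # units where source 2 dominates
```

By construction the masks are complementary, so every time-frequency unit of the mixture is routed to exactly one separated output.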

[0077] In Step S7 of FIG. 4, the clustered spectrums are transformed into a plurality of clustered sound signals by the inverse Fourier transformer 126 through Inverse Short Time Fourier Transformation (ISTFT) or Inverse Discrete Fourier Transformation (IDFT). The clustered sound signals are reconstructed, and each of the clustered sound signals is reconstructed from one of the clustered spectrums for a specific sound source. In an embodiment of the present disclosure, the inverse Fourier transformer 126 is in communication with the decoder 125 and configured to receive the clustered spectrums decoded in Step S6 by the decoder 125.

[0078] Each of the clustered sound signals reconstructed in Step S7 originates from a source. If the physiological signal received in Step S1 is collected from one auscultation site, then the clustered sound signals may originate from different sources at that auscultation site: if the physiological signal is collected from a location on the chest, then one of the clustered sound signals originates from the lung, and another of the clustered sound signals originates from the heart. Additionally, if the physiological signals received in Step S1 are collected from different auscultation sites, then the clustered sound signals may originate from different sources: if the physiological signals are collected separately from 2 or more auscultation sites on the chest, then one of the clustered sound signals originates from the lung, and another of the clustered sound signals originates from the heart.

[0079] FIG. 5 is a schematic diagram of Steps S4, S5, and S6 of FIG. 4, in accordance with an embodiment of the present disclosure. A periodicity coded matrix 41 is generated by the Fourier transformer 123 via DFT, clustered coded matrices 51 and 52 are grouped by the latent space cluster 124, and clustered spectrums 61 and 62 are generated from the clustered coded matrices 51 and 52.

[0080] FIG. 6A is a schematic representation of the periodicity coded matrix 41, in accordance with an embodiment of the present disclosure. The periodicity coded matrix 41 is presented with a y-axis of codes and an x-axis of DFT frames representing time. The periodicity coded matrix 41 comprises at least 2 periodicities, wherein the features that occur periodically are illustrated with different shades of gray. The features that constitute the periodicity can be the amplitude or the waveform of the sound signal.

[0081] FIG. 6B and 6C are schematic representations of the coded matrix, in accordance with an embodiment of the present disclosure. A first coded matrix 411 and a second coded matrix 412 are presented, each having a different periodicity.

[0082] The coded matrices 411 and 412 are to be taken together with Step S5 of FIG. 5. In Step S5 of FIG. 5, the coded matrices 411 and 412 are grouped by the latent space cluster 124, and the clustered coded matrices 51 and 52 are generated. The clustered coded matrix 51 corresponds to the coded matrix 411, and the clustered coded matrix 52 corresponds to the coded matrix 412.

[0083] In Step S6 of FIG. 5, the clustered coded matrix 51 is decoded to generate the clustered spectrum 61, and clustered coded matrix 52 is decoded to generate the clustered spectrum 62. FIG. 7A is a representation of the clustered spectrum 61, and FIG. 7B is a representation of the clustered spectrum 62, in accordance with an embodiment of the present disclosure.

[0084] The clustered spectrums 61 and 62 are transformed into a plurality of clustered sound signals by the inverse Fourier transformer 126 via ISTFT or IDFT. FIG. 8A and 8B are representations of the clustered sound signals, in accordance with an embodiment of the present disclosure. FIG. 8A and 8B are amplitude vs. time graphs, wherein the y-axis is the amplitude of the signal and the x-axis represents the time stamps of the signal.

3. The Separation of Heart Sound and Lung Sound

[0085] In the following section, the heart sound and the lung sound in mixed heart-lung sounds collected by an auscultation system are separated by an embodiment of the present disclosure. The auscultation is performed on a SAM student auscultation manikin (SAM® 3G, Cardionics), wherein the SAM manikin comprises a standard sound library recorded from patients. A sound dataset is constructed from the standard sound library and comprises at least a plurality of mixtures of heart sounds and lung sounds, wherein the heart sounds comprise normal heart sounds at 2 beating speeds and the lung sounds comprise normal, wheezing, rhonchi, and stridor sounds. The mixed heart-lung sounds are generated in terms of signal-to-noise ratio (SNR), with SNR ∈ {−6 dB, −2 dB, 0 dB, 2 dB, 6 dB}. The mixed heart-lung sounds are broadcast from the SAM manikin and collected by an electronic stethoscope (iMEDIPLUS). All of the collected sounds are sampled at 8 kHz, with a DFT frame length of 2048 samples and a DFT frame shift of 128 samples.
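Mixing two sounds at a prescribed SNR, as in the {−6, −2, 0, 2, 6} dB test conditions, can be sketched as follows; treating one sound as the reference "signal" and scaling the other as the "noise" is an assumption of this sketch:

```python
import numpy as np

def mix_at_snr(signal, noise, snr_db):
    """Scale `noise` so the mixture signal + noise has the requested
    SNR (in dB), then return the mixture."""
    p_sig = np.mean(signal ** 2)
    p_noise = np.mean(noise ** 2)
    # Solve p_sig / (scale^2 * p_noise) = 10^(snr_db / 10) for scale
    scale = np.sqrt(p_sig / (p_noise * 10 ** (snr_db / 10)))
    return signal + scale * noise

rng = np.random.default_rng(0)
heart = rng.normal(size=8000)   # stand-ins for one-second recordings
lung = rng.normal(size=8000)
mixed = mix_at_snr(heart, lung, snr_db=0)
print(mixed.shape)  # (8000,)
```

At 0 dB the two components carry equal power; the negative-SNR conditions make the scaled component dominate, which is why they represent the harder separation cases.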

[0086] The DAE model used in the embodiment consists of 7 hidden layers of 1024, 512, 256, 128, 256, 512, and 1024 neurons, respectively. The encoder in the DCAE model used in the embodiment consists of 3 convolutional layers: a 1st layer having 32 filters with a kernel size of 1×4, a 2nd layer having 16 filters with a kernel size of 1×3, and a 3rd layer having 8 filters with a kernel size of 1×3. The decoder in the DCAE model used in the embodiment consists of 3 deconvolutional layers: a 1st layer having 8 deconvolutional filters with a kernel size of 1×3, a 2nd layer having 16 deconvolutional filters with a kernel size of 1×3, and a 3rd layer having 32 deconvolutional filters with a kernel size of 1×4. Both the DAE and the DCAE models set the activation function of the encoder as a rectified linear unit and the activation function of the decoder as a hyperbolic tangent. The optimizers of both the DAE and the DCAE models are set to the Adam optimizer. An unsupervised NMF-based method is taken as the baseline of the performance benchmark, and the basis number of the NMF is set to 50. An L2 cost function is chosen for the simulation.
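The described DAE topology can be sketched as a plain forward pass; the layer sizes and activations follow the paragraph above, while the random weight initialization, the 1025-bin input dimension (one-sided 2048-point DFT), and the linear output layer are assumptions, and training (Adam, MSE back-propagation) is omitted:

```python
import numpy as np

def make_dae(input_dim=1025, seed=0):
    """Forward pass of the described DAE: 7 hidden layers of
    1024-512-256-128-256-512-1024 neurons, ReLU on the encoder side,
    tanh on the decoder side, and an output layer back to the spectrum
    dimension. Weights are random; this is an architecture sketch only.
    """
    rng = np.random.default_rng(seed)
    sizes = [input_dim, 1024, 512, 256, 128, 256, 512, 1024, input_dim]
    weights = [rng.normal(0, 0.01, (a, b))
               for a, b in zip(sizes[:-1], sizes[1:])]

    def forward(x):
        h = x
        for i, W in enumerate(weights[:-1]):
            h = h @ W
            # First 4 hidden layers (down to the 128-neuron bottleneck)
            # use ReLU; the remaining decoder layers use tanh.
            h = np.maximum(h, 0) if i < 4 else np.tanh(h)
        return h @ weights[-1]   # linear reconstruction layer

    return forward

dae = make_dae()
frame = np.random.default_rng(1).random((1, 1025))  # one LPS frame
print(dae(frame).shape)  # (1, 1025)
```

The 128-neuron bottleneck is the latent space whose rows form the coded matrix analyzed in Steps S4 and S5.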

[0087] The DAE, DCAE, and NMF models are implemented with the periodicity analysis algorithms previously described, and thus the PC-NMF, PC-DAE, and PC-DCAE models are generated from the above combinations. The PC-NMF, PC-DAE, and PC-DCAE are run on a personal computer with the sound separation process illustrated and described in FIG. 4, with the NMF as the baseline of the performance benchmark.

3.1 Evaluation of the separated sound signals by SDR, SIR, and SAR scores

[0088] The signal quality of the separated sound signals is evaluated by a signal-to-distortion ratio (SDR) and a signal-to-interference ratio (SIR). The SDR indicates the similarity (distortion) between the original SAM® signal and the reconstructed signal. The SIR indicates the signal separation clarity; the SIR value is higher when signals from different sources in the mixed signal do not interfere with each other. Table 1 and Table 2 illustrate the SDR and SIR evaluations for the NMF, PC-NMF, DCAE, PC-DAE, and PC-DCAE models on the separated heart sounds and the separated lung sounds.

[0089] Table 1: Evaluation for separated heart sounds

[0090] Table 2: Evaluation for separated lung sounds

[0091] According to Table 1 and Table 2, the SDRs of the models implemented with the periodicity analysis algorithms (the periodicity coded models, or PC-models) are higher, meaning the heart sounds and the lung sounds are better reconstructed in the PC-models than in the models without the periodicity analysis algorithms.

[0092] For the separated heart sounds in Table 1, a comparison between the NMF and the PC-NMF at 0 dB, 2 dB, and 6 dB shows the periodicity coded models have improved signal recovery qualities. At 0 dB, the SDR increased from 2.31 for the NMF to 3.91 for the PC-NMF; at 2 dB, from 2.26 to 5.39; at 6 dB, from 4.57 to 7.60. Additionally, a comparison between the DCAE and the PC-DCAE shows the periodicity coded models have significantly improved the signal recovery qualities of the separated heart sounds. At −6 dB, the SDR increased from −0.33 for the DCAE to 3.61 for the PC-DCAE; at −2 dB, from 4.81 to 7.22; at 0 dB, from 5.77 to 9.15; at 2 dB, from 6.87 to 10.46; at 6 dB, from 9.63 to 16.14. The results in Table 1 indicate the periodicity analysis algorithms have improved the recovery quality of the separated heart sounds, which suggests the PC-models can be used in harsher auscultation environments. Compared with the DAE algorithm, the DCAE algorithm is more compatible with the PC-model; therefore the PC-DCAE has better performance than the PC-DAE.

[0093] For the separated lung sounds in Table 2, a comparison between the NMF and the PC-NMF at −6 dB, −2 dB, and 0 dB shows the models implemented with the periodicity analysis algorithms, or the periodicity coded models, have improved signal recovery qualities. At −6 dB, the SDR increased from −3.71 for the NMF to −1.48 for the PC-NMF; at −2 dB, from −0.47 to −0.41; at 0 dB, from 1.30 to 1.58. Additionally, a comparison between the DCAE and the PC-DCAE shows that the periodicity coded models have significantly improved the signal recovery qualities of the separated lung sounds. At −6 dB, the SDR increased from −0.93 for the DCAE to 4.82 for the PC-DCAE; at −2 dB, from 2.94 to 7.78; at 0 dB, from 5.15 to 9.36; at 2 dB, from 7.31 to 10.20; at 6 dB, from 9.14 to 11.53. The results in Table 2 indicate the periodicity analysis algorithms have improved the recovery quality of the separated lung sounds, which suggests the PC-models can be used in harsher auscultation environments.

[0094] From the SDRs in Table 1 and Table 2, it is evident that the periodicity coded models perform better, suggesting the PC-models have improved the signal recovery qualities in all aspects. Specifically, the SDR improvements between the PC-DCAE and the DCAE are more significant than those between the PC-NMF and the NMF; therefore the DCAE is an ideal model for implementing the periodicity analysis algorithms.

[0095] According to Table 1 and Table 2, the SIRs of the PC-models are higher, meaning the heart sounds and the lung sounds are more clearly separated, without interfering with each other.

[0096] For the separated heart sounds in Table 1, a comparison between the DCAE and the PC-DCAE shows the periodicity coded models have significantly improved signal separation clarity. At −6 dB, the SIR increased from 0.66 for the DCAE to 5.01 for the PC-DCAE; at −2 dB, from 6.61 to 8.98; at 0 dB, from 7.17 to 11.29; at 2 dB, from 8.89 to 12.59; at 6 dB, from 11.93 to 16.14. However, a comparison of the NMF and the PC-NMF in Table 1 shows the periodicity coded models may have undesired influences on the heart sound separation clarity. Table 1 indicates that implementing the periodicity analysis algorithms improves the clarity of the heart sound separation with the DCAE model, rather than with the NMF model.

[0097] For the separated lung sounds in Table 2, a comparison between the NMF and the PC-NMF shows the periodicity coded models have improved the signal separation clarity. At −6 dB, the SIR increased from −1.16 for the NMF to 3.89 for the PC-NMF; at −2 dB, from 1.86 to 3.63; at 0 dB, from 3.96 to 6.21; at 2 dB, from 7.76 to 8.86; at 6 dB, from 12.28 to 13.55. Additionally, a comparison between the DCAE and the PC-DCAE shows the periodicity coded models have significantly improved the signal separation clarity of the separated lung sounds. At −6 dB, the SIR increased from −0.13 for the DCAE to 6.09 for the PC-DCAE; at −2 dB, from 3.86 to 9.83; at 0 dB, from 6.54 to 11.71; at 2 dB, from 9.31 to 13.07; at 6 dB, from 11.32 to 16.04. Table 2 indicates that the periodicity analysis algorithms have significantly improved the lung sound separation clarity, and that the periodicity analysis algorithms are more compatible with the DCAE models than with the NMF models.

[0098] From the SIRs in Table 1 and Table 2, it is evident that the SIRs of the periodicity coded models are better, suggesting the periodicity analysis algorithms have improved the signal separation clarity when implemented with the DCAE model.

[0099] In conclusion, the SDRs and the SIRs in Table 1 demonstrate the periodicity coded models have improved performance on the separated heart sounds. The SDRs and the SIRs in Table 2 demonstrate the PC-DCAE model has improved performance on the separated lung sounds. The embodiment described above separates the heart sounds and the lung sounds in the sound dataset constructed from the sound library, and the separated heart sounds and lung sounds show better signal recovery quality and signal separation clarity.

4. Configurations of sound separation systems

[0100] FIG. 9 is a schematic illustration of a system 20 in accordance with an embodiment of the present disclosure. The system 20 comprises a sound collector 21, a mobile device 221, and a processor 222. The sound collector 21 can be placed directly on the chest, the abdomen, or other auscultation sites to perform an auscultation. The sound collector 21 transmits the sound signals to the mobile device 221 by wireless communication or cable. The sound collector 21 may be capable of performing auscultation on many different auscultation sites at the same time.

[0101] The mobile device 221 can receive the sound signals from the sound collector 21. Each of the sound signals from the sound collector 21 may comprise at least two sound signals originating from different sources. The mobile device 221 can be a mobile phone, a personal computer, or another computational device with communication functions that can be carried by an individual. The mobile device 221 may transform the sound signals into spectrums and wirelessly send the spectrums to the processor 222 in a cloud infrastructure. The mobile device may also simply transmit the sound signals to the cloud infrastructure without performing any transformation.

[0102] The cloud infrastructure may be physically distant from the mobile device 221 or the sound collector 21. The cloud infrastructure comprises the processor 222 for separating sound signals, and the processor 222 comprises an encoder, a Fourier transformer, a latent space cluster, a decoder, and an inverse Fourier transformer. The processor 222 may further comprise a transformation unit if the mobile device 221 does not transform the sound signals into the spectrums. The function of the processor 222 and the procedure of the sound separation are described in FIG. 2 and 4. After the clustered signals are reconstructed by the inverse Fourier transformer in the processor 222, the cloud infrastructure may transmit the clustered signals to the mobile device 221.

[0103] The mobile device 221 may comprise a user interface so that a physician or the patient may retrieve the clustered signals from the mobile device 221. The physician or the patient may listen to the clustered signals to analyze, monitor, or diagnose various disorders related to the auscultation site.

[0104] FIG. 10 is a schematic illustration of a device 30 in accordance with an embodiment of the present disclosure. The device 30 is capable of performing auscultation and processing the sound signals. The device 30 can be directly placed on the chest, the abdomen, or other auscultation sites to perform an auscultation, and may be capable of performing auscultation on many different auscultation sites at the same time. The sound signals collected by the device 30 can be processed by a processor inside the device 30, and the sound signals originated from different sources can be separated by the procedure described and illustrated in FIG. 4. The physician may retrieve the clustered sound signals from the device 30 to monitor, analyze, or diagnose various disorders.

[0105] FIG. 11 is a schematic illustration of a system 40 in accordance with an embodiment of the present disclosure. The system 40 comprises a device 41 and a cloud infrastructure 42. The device 41 can be placed directly on the chest, the abdomen, or other auscultation sites to perform an auscultation. The device 41 may be capable of performing auscultation on many different auscultation sites at the same time.

[0106] The device 41 may transmit the sound signals to the cloud infrastructure 42 by wireless communication or cable, or the device 41 may transform the sound signals into spectrums and wirelessly send the spectrums to the cloud infrastructure 42.

[0107] The cloud infrastructure 42 may be physically distant from the device 41. The cloud infrastructure 42 may comprise an encoder, a Fourier transformer, a latent space cluster, a decoder, and an inverse Fourier transformer. The cloud infrastructure 42 may further comprise a transformation unit if the device 41 does not transform the sound signals into the spectrums. The functions of the cloud infrastructure 42 and the procedure of the sound separation are described in FIG. 2 and 4. After the clustered signals are reconstructed by the inverse Fourier transformer, the cloud infrastructure 42 may transmit the clustered signals to the device 41.

[0108] The device 41 may comprise a user interface so that a physician or the patient may retrieve the clustered signals from the device 41. The physician or the patient may listen to the clustered signals to analyze, monitor, or diagnose various disorders related to the auscultation site.

[0109] The embodiments shown and described above are only examples. Many details are often found in the art such as the other features of a circuit board assembly. Therefore, many such details are neither shown nor described. Even though numerous characteristics and advantages of the present technology have been set forth in the foregoing description, together with details of the structure and function of the present disclosure, the disclosure is illustrative only, and changes may be made in the detail, including in matters of shape, size and arrangement of the parts within the principles of the present disclosure up to, and including the full extent established by the broad general meaning of the terms used in the claims. It will therefore be appreciated that the embodiments described above may be modified within the scope of the claims.