A METHOD AND APPARATUS FOR AUTOMATICALLY DETERMINING VOLATILE ORGANIC COMPOUNDS (VOCS) IN A SAMPLE

Title:

A METHOD AND APPARATUS FOR AUTOMATICALLY DETERMINING VOLATILE ORGANIC COMPOUNDS (VOCS) IN A SAMPLE

Document Type and Number:

WIPO Patent Application WO/2016/187671

Abstract:

The present invention relates to a method and system for automatically determining volatile organic compounds (VOCs) in a sample by inputting the sample into a chamber, emitting infrared light from an optical light source into the chamber with the sample, detecting at a detector a detected infrared light from the chamber, transforming the detected infrared light to a Fourier Transform Infrared (FTIR) spectrum of the sample at a processor, wherein the FTIR spectrum has wavenumbers between 670 and 800 cm⁻¹, and processing the FTIR spectrum to determine each of the VOCs in the sample.

Inventors:

NAIDU RAVENDRA (AU)
WANG LIANG (AU)
CHEN ZULIANG (AU)
MALLAVARAPU MEGHARAJ (AU)

Application Number:

PCT/AU2016/050413

Publication Date:

December 01, 2016

Filing Date:

May 26, 2016

Assignee:

CRC CARE PTY LTD (AU)

International Classes:

G01N21/35; G01J3/42; G01J3/433; G06N3/02

Domestic Patent References:

WO2013093913A1	2013-06-27
WO2012153326A1	2012-11-15

Foreign References:

US6140647A	2000-10-31

Attorney, Agent or Firm:

PHILLIPS ORMONDE FITZPATRICK (333 Collins Street, Melbourne, Victoria 3000, AU)

Claims:

CLAIMS

1. A method of automatically determining volatile organic compounds (VOCs) in a sample, the method including:

inputting the sample into a chamber;

emitting infrared light from an optical light source into the chamber with the sample;

detecting at a detector a detected infrared light from the chamber;

transforming the detected infrared light to a Fourier Transform Infrared (FTIR) spectrum of the sample at a processor, wherein the FTIR spectrum has wavenumbers between 670 and 800 cm⁻¹;

performing baseline correction of the FTIR spectrum using an object orientated baseline correction algorithm implemented by the processor to: divide the FTIR spectrum into a designated number of segments; collect points on the FTIR spectrum representing wavenumbers with lowest absorbance values for each of the segments; disregard ones of the points on the FTIR spectrum with absorbance values higher than an average absorbance value for each segment; and generate a baseline corrected FTIR spectrum by connecting remaining ones of the points on the FTIR spectrum;

processing the baseline corrected FTIR spectrum to identify sub-bands having sub-band peaks at respective wavenumbers in a second derivative curve of the FTIR spectrum using a second derivation algorithm implemented by the processor; and processing the sub-bands using a neural network algorithm implemented by the processor that has been trained to determine each of the VOCs in the sample based on a result of a comparison of training data indicative of known sub-band peaks at known wavenumbers for the VOCs in the FTIR spectrum applied to the neural network algorithm and the sub-band peaks at respective wavenumbers of the sub-bands.

2. A method of claim 1, wherein each of the designated number of segments has a designated segment gap and the baseline correction algorithm is further implemented to determine whether two of the remaining ones of the points are located closer than the segment gap and to disregard one of the two points with a higher absorbance value.

3. A method of claim 1 or 2, wherein the VOCs in the sample include benzene, toluene, ethylbenzene, and xylenes (BTEX) components.

4. A method of any one of claims 1 to 3, further including filtering the baseline corrected FTIR spectrum using a Gaussian filter algorithm implemented by the processor to remove ones of the sub-bands having sub-band valleys higher than a threshold value in the second derivative curve.

5. A method of claim 4, wherein the Gaussian filter algorithm is a Gaussian low pass filter algorithm.

6. A method of any one of claims 1 to 5, further including optimising identification of the sub-bands in the second derivative curve using an optimisation algorithm implemented by the processor to minimise a difference between a smoothed second derivative curve and the second derivative curve having the identified sub-bands.

7. A method of claim 6, wherein the optimisation algorithm can be expressed as:

where absorbance is the second derivative curve.

8. A method of claim 7, wherein the optimisation algorithm is one of Davidon Fletcher Powell (DFP) method, Nelder-Mead (NM) method, Levenberg Marquardt (LM) method, Minimax, Linear search and Genetic algorithm (GA).

9. A method of any one of claims 1 to 8, wherein the neural network algorithm is a Back Propagation Neural Network (BPNN).

10. A method of any one of claims 1 to 9, further including smoothing the FTIR spectrum before performing the step of baseline correction using a low pass filter.

11. An apparatus for automatically determining volatile organic compounds (VOCs) in a sample, the apparatus including:

a housing;

a chamber disposed in the housing for inputting the sample therein;

an optical light source disposed in the housing for emitting infrared light into the chamber with the sample;

a detector for detecting a detected infrared light from the chamber; and a controller disposed in the housing having a processor and a memory in data communication with the processor, the controller being configured to:

transform the detected infrared light to a Fourier Transform Infrared (FTIR) spectrum of the sample, wherein the FTIR spectrum has wavenumbers between 670 and 800 cm⁻¹;

perform baseline correction of the FTIR spectrum using an object orientated baseline correction algorithm residing on the memory and implemented by the processor to: divide the FTIR spectrum into a designated number of segments; collect points on the FTIR spectrum representing wavenumbers with lowest absorbance values for each of the segments; disregard ones of the points on the FTIR spectrum with absorbance values higher than an average absorbance value for each segment; and generate a baseline corrected FTIR spectrum by connecting remaining ones of the points on the FTIR spectrum;

process the baseline corrected FTIR spectrum to identify sub-bands having sub-band peaks at respective wavenumbers in a second derivative curve of the FTIR spectrum using a second derivation algorithm residing on the memory and implemented by the processor; and

process the sub-bands using a neural network algorithm residing on the memory and implemented by the processor that has been trained to determine each of the VOCs in the sample based on a result of a comparison of training data indicative of known sub-band peaks at known wavenumbers for the VOCs in the FTIR spectrum applied to the neural network algorithm and the sub-band peaks at respective wavenumbers of the sub-bands.

12. A system of claim 11, wherein each of the designated number of segments has a designated segment gap and the baseline correction algorithm is further implemented by the processor to determine whether two of the remaining ones of the points are located closer than the segment gap and to disregard one of the two points with a higher absorbance value.

13. A system of claim 11 or 12, wherein the VOCs in the sample include benzene, toluene, ethylbenzene, and xylenes (BTEX) components.

14. A system of any one of claims 11 to 13, wherein the controller is further configured to filter the baseline corrected FTIR spectrum using a Gaussian filter algorithm residing on the memory and implemented by the processor to remove ones of the sub-bands having sub-band valleys higher than a threshold value in the second derivative curve.

15. A system of claim 14, wherein the Gaussian filter algorithm is a Gaussian low pass filter algorithm.

16. A system of any one of claims 11 to 15, wherein the controller is further configured to optimise identification of the sub-bands in the second derivative curve using an optimisation algorithm implemented by the processor to minimise a difference between a smoothed second derivative curve and the second derivative curve having the identified sub-bands.

17. A system of claim 16, wherein the optimisation algorithm can be expressed as: where absorbance is the second derivative curve.

18. A system of claim 16, wherein the optimisation algorithm is one of Davidon Fletcher Powell (DFP) method, Nelder-Mead (NM) method, Levenberg Marquardt (LM) method, Minimax, Linear search and Genetic algorithm (GA).

19. A system of any one of claims 11 to 18, wherein the neural network algorithm is a Back Propagation Neural Network (BPNN).

20. A system of any one of claims 11 to 19, wherein the controller is further configured to smooth the FTIR spectrum before performing the step of baseline correction using a low pass filter.

Description:

A method and apparatus for automatically determining volatile organic compounds (VOCs) in a sample

Technical Field

[0001] The present invention relates to a method and system for automatically determining volatile organic compounds (VOCs) in a sample. In particular, but not exclusively, the invention relates to determining benzene, toluene, ethylbenzene and xylenes (BTEX) components of the VOCs.

Background of Invention

[0002] Petroleum products such as gasoline and diesel fuel contain various Volatile Organic Compounds (VOCs), and many of these are carcinogens. Some of the most dangerous VOCs in petroleum products and natural gas are benzene, toluene, ethylbenzene and xylenes (o-, m-, p-), and these are known as BTEX components. Workers may be exposed to BTEX components during, for example, refining operations, gasoline storage, shipment and retail operations, chemical manufacturing, plastics and rubber manufacturing, shoe manufacturing, printing and activities in chemical laboratories. Accordingly, manufacturing companies have attempted to manage BTEX emissions in accordance with their country's environmental protection and occupational health and safety regulations.

[0003] Employing exemplary existing environmental analysis approaches, BTEX components are firstly sampled onto adsorbing cartridges in the field before examinations are conducted with laboratory-based thermal desorption and gas chromatography, e.g. gas chromatography-mass spectrometry (GC-MS) equipped with Photo-Ionization Detection (PID). Field samples can be collected using active vapour sampling (TO-15) or passive vapour sampling (TO-17). These approaches, however, suffer from disadvantages such as cost, sample degradation, lengthy processing time and cross contamination, and they provide only a 'snapshot' in time relative to the lengthy administration requirements of the process. Also, large numbers of samples are required to capture representative temporal variations. Some of these disadvantages can be addressed by using an existing portable GC-MS, which is small and battery powered. However, the transfer line connections of a portable GC-MS system are fragile, spinning MS turbo pumps can be damaged by excessive vibrations, and GC-MS systems are expensive.

[0004] An alternative existing method for measuring BTEX components employs a Fourier Transform Infrared Spectroscopy (FTIR) device. These FTIR devices apply the Fourier Transform algorithm to transform time domain infrared data into a frequency domain based on, say, a Michelson Interferometer. Typically, infrared wavelengths between 2.5 and 20 µm (wavenumber 500-4000 cm⁻¹), which lie in the mid-infrared region, are used to predict each of the petroleum hydrocarbons in a sample individually. Furthermore, compared to portable GC-MS, portable FTIR is cheaper and remains stable during field tests and real time monitoring.

[0005] The existing FTIR devices measure an infrared absorption spectrum. Based on quantum theory, the vibration of an isolated molecule occurs at a single frequency when it absorbs or emits energy, which gives rise to the vibrational spectrum. Unfortunately, each vibrating molecule interacts with other surrounding molecules at a slightly different frequency. Thus, the observed FTIR spectrum line shape (band) typically consists of a series of more or less overlapping bands representing these absorbed or scattered individual molecules. To analyse an unknown hydrocarbon mixture sample, extracting information and identifying the components from an overlapping IR spectrum is a key issue for FTIR devices. For example, an existing FTIR laboratory-based system applies thermal isolation techniques to isolate the hydrocarbons based on their volatilization characteristics including boiling points. However, it is costly to apply this technique on a portable FTIR for online in situ monitoring of the hydrocarbons in an open area. Alternatively, to extract quantitative information from such overlapping spectra, numerical techniques, known as curve-fitting, band decomposition, etc. have been used. These existing FTIR devices still require the separation of heavily overlapped bands to be performed by an experienced user with some knowledge about the system being studied. Indeed, the most widespread existing method of band decomposition of the infrared spectrum waveforms for chemical bonds or species identification is visual inspection by a user of an FTIR device, which can be slow and unreliable.

[0006] Reference to any prior art in this specification is not, and should not be taken as, an acknowledgment or any form of suggestion that this prior art forms part of the common general knowledge in any country.

Summary of Invention

[0007] In one aspect of the present invention, there is provided a method of automatically determining volatile organic compounds (VOCs) in a sample, the method including: inputting the sample into a chamber; emitting infrared light from an optical light source into the chamber with the sample; detecting at a detector a detected infrared light from the chamber; transforming the detected infrared light to a Fourier Transform Infrared (FTIR) spectrum of the sample at a processor, wherein the FTIR spectrum has wavenumbers between 670 and 800 cm⁻¹; performing baseline correction of the FTIR spectrum using an object orientated baseline correction algorithm implemented by the processor to: divide the FTIR spectrum into a designated number of segments; collect points on the FTIR spectrum representing wavenumbers with lowest absorbance values for each of the segments; disregard ones of the points on the FTIR spectrum with absorbance values higher than an average absorbance value for each segment, so as to remove continuous ones of the sub-bands from discontinuous ones of the sub-bands; and generate a baseline corrected FTIR spectrum by connecting remaining ones of the points on the FTIR spectrum; processing the baseline corrected FTIR spectrum to identify sub-bands having sub-band peaks at respective wavenumbers in a second derivative curve of the FTIR spectrum using a second derivation algorithm implemented by the processor; and processing the discontinuous ones of the sub-bands using a neural network algorithm implemented by the processor that has been trained to determine each of the VOCs in the sample based on a result of a comparison of training data indicative of known sub-band peaks at known wavenumbers for the VOCs in the FTIR spectrum applied to the neural network algorithm and the sub-band peaks at respective wavenumbers of the discontinuous ones of the sub-bands.

[0008] In another aspect of the present invention, there is provided an apparatus for automatically determining volatile organic compounds (VOCs) in a sample, the apparatus including: a housing; a chamber disposed in the housing for inputting the sample therein; an optical light source disposed in the housing for emitting infrared light into the chamber with the sample; a detector for detecting a detected infrared light from the chamber; and a controller disposed in the housing having a processor and a memory in data communication with the processor, the controller being configured to: transform the detected infrared light to a Fourier Transform Infrared (FTIR) spectrum of the sample, wherein the FTIR spectrum has wavenumbers between 670 and 800 cm⁻¹; perform baseline correction of the FTIR spectrum using an object orientated baseline correction algorithm residing on the memory and implemented by the processor to: divide the FTIR spectrum into a designated number of segments; collect points on the FTIR spectrum representing wavenumbers with lowest absorbance values for each of the segments; disregard ones of the points on the FTIR spectrum with absorbance values higher than an average absorbance value for each segment; and generate a baseline corrected FTIR spectrum by connecting remaining ones of the points on the FTIR spectrum; process the baseline corrected FTIR spectrum to identify sub-bands having sub-band peaks at respective wavenumbers in a second derivative curve of the FTIR spectrum using a second derivation algorithm residing on the memory and implemented by the processor; and process the sub-bands using a neural network algorithm residing on the memory and implemented by the processor that has been trained to determine each of the VOCs in the sample based on a result of a comparison of training data indicative of known sub-band peaks at known wavenumbers for the VOCs in the FTIR spectrum applied to the neural network algorithm and the sub-band peaks at respective wavenumbers of the sub-bands.

[0009] Preferably, the VOCs in the sample include benzene, toluene, ethylbenzene, and xylenes (BTEX) components, and the system automatically determines BTEX vapours using the FTIR spectrum. The system inspects the infrared spectrum at wavelengths between 12.5 and 15 µm (wavenumber 670-800 cm⁻¹). This spectral region is referred to hereinafter as the 'fingerprint' region where BTEX components can be represented with peaks at different locations. For example, peaks were located at the following wavenumbers of 673, 697, 728, 740, 768 and 795 cm⁻¹, for benzene, toluene, ethylbenzene and (o-, m-, p-) xylene, respectively. It will be appreciated by those persons skilled in the art that the FTIR spectrum is a graph of infrared light (IR) absorbance or transmittance at different wavelengths of the IR light.
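The correspondence between the wavelength limits and the wavenumber limits of the 'fingerprint' region follows from the standard reciprocal relation between the two quantities. A minimal sketch, in which the function name is illustrative only and not part of the specification:

```python
def wavelength_um_to_wavenumber(wavelength_um: float) -> float:
    """Convert a wavelength in micrometres to a wavenumber in cm^-1."""
    if wavelength_um <= 0:
        raise ValueError("wavelength must be positive")
    # 1 cm = 10^4 um, so wavenumber (cm^-1) = 10^4 / wavelength (um).
    return 1.0e4 / wavelength_um


upper = wavelength_um_to_wavenumber(12.5)  # upper edge of the fingerprint region
lower = wavelength_um_to_wavenumber(15.0)  # lower edge of the fingerprint region
```

The 12.5 µm limit maps to 800 cm⁻¹, and the 15 µm limit maps to roughly 667 cm⁻¹, which the specification rounds to 670 cm⁻¹.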

[0010] In an embodiment, each of the designated number of segments has a designated segment gap and the baseline correction algorithm is further implemented to determine whether two of the remaining ones of the points are located closer than the segment gap and to disregard one of the two points with a higher absorbance value. Thus, the baseline correction removes continuous ones of the sub-bands from discontinuous ones of the sub-bands.

[0011] In an embodiment, the method further includes filtering the baseline corrected FTIR spectrum using a Gaussian filter algorithm implemented by the processor to remove ones of the sub-bands having sub-band valleys higher than a threshold value in the second derivative curve. Preferably, the Gaussian filter algorithm is a Gaussian low pass filter algorithm.

[0012] In an embodiment, the method further includes optimising identification of the sub-bands in the second derivative curve using an optimisation algorithm implemented by the processor to minimise a difference between a smoothed second derivative curve and the second derivative curve having the identified sub-bands. Preferably, wherein the optimisation algorithm can be expressed as: where absorbance is the second derivative curve.

[0013] In another embodiment, the optimisation algorithm is one of Davidon Fletcher Powell (DFP) method, Nelder-Mead (NM) method, Levenberg Marquardt (LM) method, Minimax, Linear search and Genetic algorithm (GA).

[0014] That is, an embodiment of the method includes four steps: baseline correction, noise filter processing using Gaussian filter algorithms, band decomposition (curve fitting) and Neural Network determination. The baseline correction algorithm can incorporate four different formulae: constant, linear, quadratic and cubic formulae, which are better suited to different applications. For instance, constant and linear may not be suitable for high frequency regions, while quadratic and cubic may actually over-fit some potential peaks. In one example, BTEX mixture vapours were normally detected with low signal to noise ratio and high frequency spectra. Over-fitting the peaks, which might indicate the BTEX components, would result in important information being lost to further analysis. Consequently, the baseline correction algorithm was developed for high frequency spectrum regions with less over-fitting of potential peaks and is suitable for identifying BTEX components in the 'fingerprint' region (wavenumber 670-800 cm⁻¹). Moreover, as above, the baseline correction algorithm is an object oriented algorithm that draws a baseline based on the spectrum itself. Also, the FTIR spectrum may be smoothed using a low pass filter before performing the step of baseline correction.

[0015] In the embodiment, the Gaussian signal filtering algorithms are employed to smooth the signals after baseline correction. For band decomposition, the location, amplitude and width of the sub-bands were determined with optimization algorithms, described above, and these sub-bands were initialized with a 2nd derivative curve. Finally, the levels of BTEX compounds were determined via the Back Propagation Neural Network (BPNN) using the amplitudes and the locations of the identified sub-bands.
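One way to picture the initialisation of sub-bands from a second derivative curve is sketched below. It is an illustrative sketch only, not the second derivation algorithm of the embodiment: it assumes uniformly spaced wavenumbers, uses a central finite difference, and treats each sufficiently deep local minimum of the second derivative as a candidate sub-band centre. The function names and the `depth` threshold are hypothetical.

```python
import math
from typing import List, Sequence


def second_derivative(x: Sequence[float], y: Sequence[float]) -> List[float]:
    """Central-difference second derivative; assumes uniform spacing in x."""
    h = x[1] - x[0]
    d2 = [0.0] * len(y)  # the two end points are left at zero
    for i in range(1, len(y) - 1):
        d2[i] = (y[i - 1] - 2.0 * y[i] + y[i + 1]) / (h * h)
    return d2


def find_sub_band_centres(x: Sequence[float], y: Sequence[float],
                          depth: float = 0.0) -> List[float]:
    """Return x positions where the second derivative has a local minimum
    below -depth. An absorbance peak is concave, so its centre shows up as
    a negative minimum in the second derivative curve."""
    d2 = second_derivative(x, y)
    centres = []
    for i in range(2, len(y) - 2):
        if d2[i] < -depth and d2[i] < d2[i - 1] and d2[i] <= d2[i + 1]:
            centres.append(x[i])
    return centres


# Synthetic spectrum (assumed data): two Gaussian bands at 728 and 740 cm^-1.
x = [670.0 + i for i in range(131)]
y = [math.exp(-((xi - 728.0) ** 2) / 18.0)
     + 0.5 * math.exp(-((xi - 740.0) ** 2) / 18.0) for xi in x]
centres = find_sub_band_centres(x, y, depth=0.01)
```

The centres found this way would serve only as starting values; the location, amplitude and width of each sub-band would still be refined by one of the optimisation methods named above.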

Brief Description of Drawings

[0016] In order that the invention can be more clearly understood, examples of embodiments will now be described with reference to the accompanying drawings, in which:

[0017] Figure 1 shows a representation of an apparatus for automatically determining volatile organic compounds (VOCs) in a sample, according to an embodiment of the invention;

[0018] Figure 2 shows a block diagram of a method of automatically determining volatile organic compounds (VOCs) in a sample, according to an embodiment of the invention;

[0019] Figure 3 shows the FTIR spectrum for individual BTEX components in a sample obtained according to an embodiment of the invention;

[0020] Figure 4 shows the effect of a Gaussian low pass filter with four standard deviations on FTIR spectrum of a sample having BTEX components;

[0021] Figure 5 shows baseline correction being applied to the FTIR spectrum of Figure 4;

[0022] Figure 6 shows identifying sub-band of the baseline corrected FTIR spectrum of Figure 5 using a second derivation curve;

[0023] Figure 7 shows curve fitting result using Minimax optimization method being applied to the sub-bands of the FTIR spectrum of Figure 6;

[0024] Figure 8 shows validation results for BPNN predictions of the sub-bands of Figure 7 versus using existing mass spectrometry (GC-MS);

[0025] Figure 9 shows FTIR spectrum data for a petrol sample A obtained according to existing mass spectrometry (GC-MS) and obtained according to an embodiment of the invention;

[0026] Figure 10 shows spectrum data for petrol sample B obtained according to existing mass spectrometry (GC-MS) and obtained according to an embodiment of the invention;

[0027] Figure 11 shows spectrum data for petrol sample C obtained according to existing mass spectrometry (GC-MS) and obtained according to an embodiment of the invention; and

[0028] Figure 12 shows spectrum data for petrol sample D obtained according to existing mass spectrometry (GC-MS) and obtained according to an embodiment of the invention.

Detailed Description

[0029] According to an embodiment of the present invention there is provided an apparatus 10 for automatically determining volatile organic compounds (VOCs) in a sample, as shown in Figure 1. The apparatus 10 includes a housing 12 and a chamber 14 disposed in the housing 12 for inputting the sample therein at a sample inlet 20. The sample inlet 20 can also be configured to remove the sample from the chamber 14. The housing 12 includes an optical light source 16 disposed in the housing 12 for emitting infrared light into the chamber 14 with the sample and a detector 18 for detecting a detected infrared light from the chamber (not shown is a controller disposed in the housing having a processor and a memory in data communication with the processor).

[0030] The controller is configured to determine VOCs - particularly, BTEX compounds - in the sample by implementing the following steps: transform the detected infrared light to a Fourier Transform Infrared (FTIR) spectrum of the sample, wherein the FTIR spectrum has wavenumbers between 670 and 800 cm⁻¹; perform baseline correction of the FTIR spectrum using an object orientated baseline correction algorithm residing on the memory and implemented by the processor to: divide the FTIR spectrum into a designated number of segments; collect points on the FTIR spectrum representing wavenumbers with lowest absorbance values for each of the segments; disregard ones of the points on the FTIR spectrum with absorbance values higher than an average absorbance value for each segment; and generate a baseline corrected FTIR spectrum by connecting remaining ones of the points on the FTIR spectrum; process the baseline corrected FTIR spectrum to identify sub-bands having sub-band peaks at respective wavenumbers in a second derivative curve of the FTIR spectrum using a second derivation algorithm residing on the memory and implemented by the processor; and process the sub-bands using a neural network algorithm residing on the memory and implemented by the processor that has been trained to determine each of the VOCs in the sample based on a result of a comparison of training data indicative of known sub-band peaks at known wavenumbers for the VOCs in the FTIR spectrum applied to the neural network algorithm and the sub-band peaks at respective wavenumbers of the sub-bands.

[0031] As described, preferably the VOCs are BTEX components, which are determined as mixture vapours in the sample. This apparatus 10 uses FTIR to inspect the infrared spectrum in a 'fingerprint' region at wavelengths between 12.5 and 15 µm (wavenumber 670-800 cm⁻¹). As demonstrated in Figure 3, which shows the output of the apparatus 10, in the 'fingerprint' region, BTEX components can be represented with peaks at different locations: the highest peaks were located roughly at the following wavenumbers of 673, 697, 728, 740, 768 and 795 cm⁻¹, for the BTEX components of benzene, toluene, ethylbenzene and (o-, m-, p-) xylene, respectively.

[0032] Figure 2 shows a flow chart of a method 100 of automatically determining volatile organic compounds (VOCs) in the sample. The method 100 includes initially inputting 102 the sample into a chamber, emitting infrared (IR) light into the chamber with the sample, and detecting a detected IR light from the chamber. The method 100 then includes: transforming 104 the detected IR light to a Fourier Transform Infrared (FTIR) spectrum of the sample, wherein the FTIR spectrum has wavenumbers between 670 and 800 cm⁻¹; performing 106 baseline correction of the FTIR spectrum using an object orientated baseline correction algorithm implemented by the processor to: divide the FTIR spectrum into a designated number of segments; collect points on the FTIR spectrum representing wavenumbers with lowest absorbance values for each of the segments; disregard ones of the points on the FTIR spectrum with absorbance values higher than an average absorbance value for each segment; and generate a baseline corrected FTIR spectrum by connecting remaining ones of the points on the FTIR spectrum; processing 108 the baseline corrected FTIR spectrum to identify sub-bands having sub-band peaks at respective wavenumbers in a second derivative curve of the FTIR spectrum using a second derivation algorithm implemented by the processor; and processing 110 the sub-bands using a neural network algorithm implemented by the processor that has been trained to determine each of the VOCs in the sample based on a result of a comparison of training data indicative of known sub-band peaks at known wavenumbers for the VOCs in the FTIR spectrum applied to the neural network algorithm and the sub-band peaks at respective wavenumbers of the sub-bands.
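The baseline correction step 106 can be sketched as follows. This is an illustrative reading of the four sub-steps and not the implementation of the embodiment. Three points are assumptions made for the sketch: the "average absorbance value" is taken as the mean absorbance of the whole spectrum, the remaining points are connected by straight lines, and the corrected spectrum is obtained by subtracting that connected baseline. The optional `min_gap` argument stands in for the designated segment gap, and all names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def segment_baseline(wavenumbers: Sequence[float], absorbance: Sequence[float],
                     n_segments: int, min_gap: float = 0.0
                     ) -> Tuple[List[float], List[float]]:
    """Return (baseline, corrected) for one spectrum with monotonic wavenumbers."""
    n = len(absorbance)
    if len(wavenumbers) != n or n_segments < 1 or n < n_segments:
        raise ValueError("need matching arrays and at least one point per segment")

    # Step 1: lowest-absorbance point of each segment.
    anchors = []
    for s in range(n_segments):
        start, stop = s * n // n_segments, (s + 1) * n // n_segments
        anchors.append(min(range(start, stop), key=lambda i: absorbance[i]))

    # Step 2 (assumed reading): drop minima above the mean absorbance.
    mean_abs = sum(absorbance) / n
    anchors = [i for i in anchors if absorbance[i] <= mean_abs]

    # Step 3: of two points closer than min_gap, keep the lower one.
    kept: List[int] = []
    for i in anchors:
        if kept and abs(wavenumbers[i] - wavenumbers[kept[-1]]) < min_gap:
            if absorbance[i] < absorbance[kept[-1]]:
                kept[-1] = i
        else:
            kept.append(i)

    # Step 4: connect the kept points linearly; hold flat beyond the ends.
    baseline = [0.0] * n
    for k in range(kept[0] + 1):
        baseline[k] = absorbance[kept[0]]
    for k in range(kept[-1], n):
        baseline[k] = absorbance[kept[-1]]
    for a, b in zip(kept, kept[1:]):
        for k in range(a, b + 1):
            t = (wavenumbers[k] - wavenumbers[a]) / (wavenumbers[b] - wavenumbers[a])
            baseline[k] = absorbance[a] + t * (absorbance[b] - absorbance[a])
    corrected = [y - b for y, b in zip(absorbance, baseline)]
    return baseline, corrected


# Synthetic demo (assumed data): sloping background plus one band at 728 cm^-1.
x = [670.0 + i for i in range(131)]
y = [0.1 + 0.001 * (xi - 670.0) + math.exp(-((xi - 728.0) ** 2) / 18.0) for xi in x]
baseline, corrected = segment_baseline(x, y, n_segments=10)
peak_index = max(range(len(corrected)), key=lambda i: corrected[i])
```

In the synthetic demo the segment minimum that falls inside the band is rejected by the mean-absorbance test, so the baseline passes under the band rather than through it and the band height is largely preserved.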

[0033] In an embodiment, the method further includes the step of noise filter processing, such as a Gaussian filter algorithm to remove ones of sub-bands having sub-band valleys higher than a threshold value in the second derivative curve to derive filtered sub-bands. The filtered sub-bands are then processed using the neural network algorithm to determine each of the VOCs in the sample based on a result of a comparison of training data indicative of known sub-band peaks for the VOCs in the FTIR spectrum applied to the neural network algorithm and filtered sub-band peaks at respective wavenumbers of the filtered sub-bands. Thus, in the embodiment, the method 100 includes four steps: baseline correction, noise filter processing, band decomposition (curve fitting) and Neural Network determination for determining the BTEX compounds in the sample. In another embodiment, the method 100 includes five steps including an additional step of smoothing the FTIR spectrum using a low pass filter before performing baseline correction.
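For readers unfamiliar with back propagation, the following is a generic sketch of a small network with one hidden layer trained by gradient descent. It is not the trained network of the embodiment: the layer size, learning rate, number of epochs and the toy data (two input features mapped to one output) are all assumptions chosen for illustration. In the method described here, the inputs would instead be features of the identified sub-bands and the outputs the levels of the VOCs.

```python
import math
import random
from typing import Callable, List, Sequence


def train_bpnn(samples: Sequence[Sequence[float]], targets: Sequence[float],
               n_hidden: int = 4, lr: float = 0.1, epochs: int = 2000,
               seed: int = 0) -> Callable[[Sequence[float]], float]:
    """Train a one-hidden-layer tanh network with a linear output unit."""
    rng = random.Random(seed)
    n_in = len(samples[0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b2 = 0.0

    def forward(x: Sequence[float]):
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(w1, b1)]
        return h, sum(w * hj for w, hj in zip(w2, h)) + b2

    for _ in range(epochs):
        for x, t in zip(samples, targets):
            h, y = forward(x)
            err = y - t  # derivative of 0.5 * (y - t)^2 with respect to y
            for j in range(n_hidden):
                # Propagate the error back through the tanh unit, using the
                # output weight before it is updated.
                grad_h = err * w2[j] * (1.0 - h[j] ** 2)
                w2[j] -= lr * err * h[j]
                b1[j] -= lr * grad_h
                for i in range(n_in):
                    w1[j][i] -= lr * grad_h * x[i]
            b2 -= lr * err

    return lambda x: forward(x)[1]


# Toy data (assumed): a linear relation between two features and one level.
samples = [[a, b] for a in (0.0, 0.5, 1.0) for b in (0.0, 0.5, 1.0)]
targets = [0.5 * a + 0.2 * b for a, b in samples]
predict = train_bpnn(samples, targets)
mse = sum((predict(s) - t) ** 2 for s, t in zip(samples, targets)) / len(samples)
```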

[0034] As above, the step of baseline correction includes four optional formulae: constant, linear, quadratic and cubic. Constant and linear may not be suitable for high frequency regions, while quadratic and cubic may actually misfit some potential peaks. In the embodiment, BTEX mixture vapours were normally detected with low signal-to-noise ratio and high frequency spectra. Accordingly, mis-fitting the peaks, which might indicate BTEX components, would result in important information being lost. The baseline correction algorithm of the embodiment was developed for high frequency spectrum regions with less mis-fitting of the potential peaks, and thus it is suitable for identifying BTEX components in a sample in the 'fingerprint' region (wavenumber 670-800 cm⁻¹). Furthermore, the baseline correction algorithm of the embodiment is an object oriented algorithm that draws a baseline based only on the spectrum itself. A Gaussian signal filtering technique was employed to smooth and filter the signals after baseline correction. For band decomposition, the location, amplitude and width of the sub-bands were determined with optimization algorithms, such as Minimax described below, and these sub-bands were initialized with a 2nd derivative curve. Finally, BTEX compounds were determined via Back Propagation Neural Network (BPNN), using the amplitudes of the identified sub-bands relative to the predefined location of the BTEX components. Following programming and configuration, the apparatus 10 could thus be applied to, say, online in-situ BTEX monitoring in the relevant industries.

[0035] The following example further illustrates aspects of the method 100 and apparatus 10. Four real petrol samples from BP and Caltex were employed in the example as a case study. It can be seen that with proper configuration and calibration, the apparatus 10 is able to be utilized for online in-situ BTEX monitoring. Specifically, in the example, IR spectra were collected using a Cary 600 series FTIR instrument from Agilent Technologies (Agilent Technologies, Santa Clara, CA, USA), with a 2 cm⁻¹ resolution, 32 repeated scans in the 670 to 800 cm⁻¹ region, and a 5 kHz speed and 1.28 kHz filter. Sensitivity was set to 8, aperture was set at open and the range of IR intensity was between 2.8 and 3.4.

[0036] An apparatus of the type shown in Figure 1 as apparatus 10 was used in the example. In the example, the chamber 14 of the apparatus 10 for performing the BTEX analysis has a 2750 ml volume and has a 12 mm diameter hole on each of two sides for the detector 18 and the emitter 16, each hole being sealed with a plate made of potassium bromide, an IR transparent material. Finally, data processing and analysis were done in MATLAB R2012b using the Statistical Analysis and neural network toolboxes. An Orthogonal Experimental Design (OED) method was used for data processing; it is one of the most effective and time-saving methods, and it can minimize the number of training samples without losing any quality characteristics for specific ions.

[0037] In the example, as a hypothesis, assume each individual VOC component has three different but simple levels of concentration, so that the total number of possible mixtures of these six desired components is 3⁶ = 729. As it is time-consuming to collect the combined total of all concentration levels for each component, assembling a calibration dataset with a minimum number of samples and maximum information is a key issue for training a Neural Network determination system effectively. To deal with multifactor experiments, an orthogonal design table (ODT) with reasonable and representative levels of all factors should be determined first, at least theoretically. The ODT for this study is detailed in Table 1. Considering the EPA regulations and the detection limits of FTIR, for these six chemicals with three levels of concentration, the number of combinations was set at 18. BTEX solutions were mixed using pure standard solutions of benzene, toluene, ethylbenzene and (o-, m-, p-) xylene (Sigma Aldrich). Furthermore, 10 random combination mixtures were employed as a testing set to validate the prediction system. The droplets were injected into the chamber 14 from the sample inlet 20 and vaporized. The concentration of each BTEX component was calculated by multiplying the density by the droplet volume and then dividing by the volume of the chamber. All measurements were carried out at the same temperature (22°C) in triplicate and the average values were reported for processing.
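By way of illustration only, the concentration calculation described above (density multiplied by droplet volume, divided by chamber volume) might be sketched as follows. The function names and the benzene density of about 0.8765 g/ml are assumptions introduced for this sketch; only the 2750 ml chamber volume and the three-level, six-component design are taken from the example.

```python
# Illustrative sketch only; names and the density value are assumptions.

def vapour_concentration_mg_m3(density_g_per_ml, droplet_ul, chamber_ml):
    """Nominal concentration of a fully vaporized droplet in the chamber."""
    mass_mg = density_g_per_ml * droplet_ul   # g/ml equals mg/ul, so this is mg
    chamber_m3 = chamber_ml / 1.0e6           # 1 m^3 = 1e6 ml
    return mass_mg / chamber_m3


def droplet_volume_ul(target_mg_m3, density_g_per_ml, chamber_ml):
    """Droplet volume (ul) needed to reach a target nominal concentration."""
    mass_mg = target_mg_m3 * chamber_ml / 1.0e6
    return mass_mg / density_g_per_ml


# Size of the full factorial design: six components at three levels each.
n_full_factorial = 3 ** 6

# Example: droplet needed for a nominal 100 mg/m^3 of benzene
# (density assumed to be 0.8765 g/ml) in the 2750 ml chamber.
benzene_ul = droplet_volume_ul(100.0, 0.8765, 2750.0)
```

Under these assumptions the benzene droplet comes out at roughly 0.31 µl, which indicates the scale of the volumes being dispensed.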

[0038] Low pass filter and Baseline correction

[0039] Firstly, the raw spectrum data is smoothed using a Gaussian filter. A Gaussian filter is a low pass filter that reduces the high-frequency components, which are assumed to be noise. Optimizing the standard deviation (std) of the Gaussian function can reduce the high-frequency noise for further analyses while minimizing the loss of information. Figure 4 illustrates the effect of the Gaussian function on the BTEX 'fingerprint' spectrum with various standard deviations. It will be appreciated that the higher the standard deviation, the greater the number of peaks that will be smoothed out from the spectrum. Conversely, more peaks will be retained at lower standard deviations, but with a poorer signal-to-noise ratio. In order to preserve as much spectrum information as possible, a setting of one std for the low pass filter was chosen. After being passed through the signal filter, the smoothed spectrum data is ready for baseline correction.
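As a non-limiting sketch, a Gaussian low pass filter of the kind described above could be written as below. The example does not state the units of the standard deviation or how the ends of the spectrum are treated; this sketch assumes the std is expressed in data points, truncates the kernel at four standard deviations and clamps indices at the edges. All function names are hypothetical.

```python
import math


def gaussian_kernel(std, truncate=4.0):
    """Normalized Gaussian weights; std is in data points (assumption)."""
    radius = max(1, int(truncate * std + 0.5))
    weights = [math.exp(-0.5 * (k / std) ** 2) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def gaussian_smooth(values, std=1.0):
    """Low pass filter a spectrum by convolving it with a Gaussian kernel."""
    kernel = gaussian_kernel(std)
    radius = len(kernel) // 2
    n = len(values)
    smoothed = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            # Clamp the index so the first/last value is repeated at the edges.
            j = min(max(i + k - radius, 0), n - 1)
            acc += w * values[j]
        smoothed.append(acc)
    return smoothed
```

A larger std lowers and broadens a narrow peak more strongly, which is the trade-off between noise reduction and loss of peaks discussed above.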

[0040] As demonstrated in Figure 5, this object-oriented baseline correction algorithm is based only on the object (FTIR spectrum) itself. At first, the whole of the observed spectrum is divided into the designated number of segments with a designated segment gap. The algorithm collects the wavenumber with the lowest absorbance value in each segment as a preserved point (shown as open circles in Figure 5). The algorithm then disregards any preserved point whose absorbance value is higher than the average value of the spectrum absorbance data. For the remaining points, if any two of them are located closer together than the segment gap, the algorithm retains only the point with the lower absorbance value and eliminates the other point. Finally, the baseline can be drawn by simply connecting the remaining points (solid squares) with straight lines. Moreover, the starting point of the spectrum is connected to the first selected point with a horizontal line, as are the last selected point and the final point in the spectrum. The selection of the segment gap size should be based on the spectrum frequency: the higher the spectrum frequency, the smaller the segment gap (or the more segments) required, and vice versa. Following this, if any negative absorbance values have occurred after baseline correction, the system runs the algorithm again, iteratively, on the previously corrected spectrum data until all spectrum data have non-negative values.
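The steps of the baseline correction described above can be sketched as follows. This is an illustrative reading of the description and not the code used in the example: it assumes the segment gap is given as a number of data points, that each segment is one gap long, and it adds an iteration cap and a small numerical tolerance that the description does not mention. All names are hypothetical.

```python
def select_anchor_points(y, gap):
    """Pick baseline points: segment minima that lie at or below the mean."""
    n = len(y)
    mean = sum(y) / n
    # Lowest absorbance in each segment of `gap` points.
    candidates = [min(range(s, min(s + gap, n)), key=lambda i: y[i])
                  for s in range(0, n, gap)]
    # Disregard points above the average absorbance of the spectrum.
    candidates = [i for i in candidates if y[i] <= mean]
    # Of two points closer together than the gap, keep the lower one.
    anchors = []
    for i in candidates:
        if anchors and i - anchors[-1] < gap:
            if y[i] < y[anchors[-1]]:
                anchors[-1] = i
        else:
            anchors.append(i)
    return anchors


def draw_baseline(y, anchors):
    """Straight lines between anchors, horizontal lines at both ends."""
    n = len(y)
    baseline = [0.0] * n
    first, last = anchors[0], anchors[-1]
    for i in range(0, first + 1):
        baseline[i] = y[first]
    for i in range(last, n):
        baseline[i] = y[last]
    for left, right in zip(anchors, anchors[1:]):
        for i in range(left, right + 1):
            t = (i - left) / (right - left)
            baseline[i] = y[left] + t * (y[right] - y[left])
    return baseline


def correct_baseline(y, gap, max_iter=50, tol=1e-12):
    """Subtract the baseline; repeat while negative values remain.

    max_iter and tol are safeguards added for this sketch (assumptions).
    """
    corrected = list(y)
    for _ in range(max_iter):
        anchors = select_anchor_points(corrected, gap)
        baseline = draw_baseline(corrected, anchors)
        corrected = [v - b for v, b in zip(corrected, baseline)]
        if min(corrected) >= -tol:
            break
    return corrected
```

For a single triangular peak of height 1 on a sloping background, for instance, the corrected value at the peak centre returns to about 1, while the portion after the last selected point keeps a residual slope because the baseline is horizontal there.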

[0041] Curve fitting methodology

[0042] Initializing the number of sub-bands and their approximate locations is the first key issue for band decomposition and curve fitting, and this can be solved using the 2nd derivative curve (SDC). As illustrated in Figure 6, the number and location of the valleys of the SDC can be taken as approximately the number and location of the sub-bands. For band decomposition, a large number of bands (peaks) may provide a good visual residual, but some of the peaks may have no source in reality. Starting with a smaller number of bands and increasing the number of bands at scientifically meaningful locations is the general approach for human analysis. However, in the example, for automatic band decomposition, the number of bands cannot generally be as flexible as with visual inspection. Thus, in the example, the number of bands is established before running the curve-fitting algorithm. If the bands can be identified by the valleys of the SDC, then the band number can be controlled by eliminating some of the SDC valleys.

[0043] According to Figure 6, the dominant peaks of the original spectrum appear as the lower valleys in the SDC. Hence, the small valleys, representing the secondary peaks, can be eliminated using a Gaussian low pass filter. Comparing the dashed line and the bold continuous line in Figure 6, it can be seen that some of the valleys were eliminated using the signal filter with 1.5 std. Since the spectrum signal had already been smoothed before the baseline correction and curve fitting, preserving the information should take priority over eliminating valleys. In the example, the SDC with 1.5 std was employed to identify the number and location of the sub-bands.
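For illustration, locating sub-bands from the valleys of the 2nd derivative curve might be sketched as below. The sketch assumes evenly spaced data points and approximates the 2nd derivative by a central second difference; the additional smoothing of the SDC (1.5 std in the example) is omitted for brevity, and the function names are hypothetical.

```python
import math


def second_derivative(y):
    """Central second difference of evenly spaced data; end points set to 0."""
    d2 = [0.0] * len(y)
    for i in range(1, len(y) - 1):
        d2[i] = y[i - 1] - 2.0 * y[i] + y[i + 1]
    return d2


def find_valleys(d2):
    """Indices where the SDC has a negative local minimum."""
    return [i for i in range(1, len(d2) - 1)
            if d2[i] < 0.0 and d2[i] < d2[i - 1] and d2[i] <= d2[i + 1]]


# Synthetic spectrum with two Gaussian bands centred at points 20 and 45.
spectrum = [math.exp(-0.5 * ((i - 20) / 3.0) ** 2)
            + 0.6 * math.exp(-0.5 * ((i - 45) / 3.0) ** 2)
            for i in range(70)]
band_locations = find_valleys(second_derivative(spectrum))
```

On this noise-free synthetic spectrum the valleys coincide with the two band centres; on measured data, smoothing the SDC first would suppress the small valleys as described above.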

[0044] Band decomposition was subsequently performed using various mathematical optimization algorithms. A set of available alternatives was set up to enable the optimization algorithms to select the best element that could solve the problem with regard to some criteria. In the band decomposition scenario of the example, the spectrum band is decomposed using Gaussian curves. A Gaussian function can be expressed as (1):

$$f(x) = a \exp\left(-\frac{(x - b)^2}{2c^2}\right) + d \qquad (1)$$

Where: a: the amplitude of the curve; b: the centre location of the Gaussian curve; c: the standard deviation (width) of the curve; d: the constant of y-axis compensation.

[0045] Since the spectrum band has been baseline corrected, the parameter d could be fixed at 0 and the available alternatives for the band separation would be: a (amplitude), b (location) and c (standard deviation or width) of the peaks. Since the locations of the peaks have been initialized using the SDC, defined as 'x₀', a 2 cm⁻¹ wavenumber variation range, Δx ∈ (-2, 2), was set for the location. The alternative range of the amplitude was set from zero to the absorbance value of the smoothed spectrum data at the location x₀. Moreover, the width parameter ('c') range was also set from 0 to 2 cm⁻¹.
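The parameter ranges described above may be illustrated with the following sketch. It assumes the usual Gaussian form with amplitude a, centre b and width c, with d fixed at 0 after baseline correction; the function names, the dictionary layout and the sample values are hypothetical.

```python
import math


def gaussian(x, a, b, c, d=0.0):
    """Gaussian band with amplitude a, centre b, width c and offset d."""
    return a * math.exp(-((x - b) ** 2) / (2.0 * c ** 2)) + d


def parameter_bounds(x0, absorbance_at_x0, max_shift=2.0, max_width=2.0):
    """Search ranges for one sub-band initialized at wavenumber x0."""
    return {
        "a": (0.0, absorbance_at_x0),            # zero up to the smoothed spectrum value
        "b": (x0 - max_shift, x0 + max_shift),   # x0 plus or minus 2 cm^-1
        # The text gives 0 to 2 cm^-1; a fit would need c > 0 to avoid
        # division by zero (practical note, not stated in the example).
        "c": (0.0, max_width),
    }


def sum_of_bands(x, bands):
    """Overlay of several sub-bands, each given as an (a, b, c) tuple."""
    return sum(gaussian(x, a, b, c) for a, b, c in bands)


# Hypothetical sub-band initialized at 697 cm^-1 with absorbance 0.42.
bounds_697 = parameter_bounds(697.0, 0.42)
```

An optimizer would then search within these ranges so that the overlay returned by `sum_of_bands` approaches the smoothed spectrum band.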

[0046] For comparison, after setting up and randomly initializing the alternatives, mathematical optimization algorithms were employed to select the best parameter values (the parameters 'a', 'b' and 'c'). The purpose is to minimize the difference between the smoothed spectrum band and the separated overlaying sub-bands. Therefore, the function for optimization is (2):

[0047] The algorithms employed here include: the Davidon Fletcher Powell (DFP) method, the Nelder-Mead (NM) method, the Levenberg Marquardt (LM) method, Minimax, Linear search and the Genetic algorithm (GA). It will be appreciated by those persons skilled in the art that GA is a type of evolutionary algorithm that mimics the process of natural selection. Unlike the other candidates, instead of having only one initial vector of parameters, GA starts with a number of vectors of parameters, each vector being represented as a chromosome. The optimization is processed iteratively using techniques inspired by natural evolution, such as selection, mutation and crossover. For the GA applied in the example, the population was set at 60 chromosomes and the maximum number of generations was set at 1000. For offspring, the parents were selected using Stochastic Universal Sampling, recombined by multi-point crossover and mutated, each element with an initial probability (p = 0.7). Function (2) was adopted as the fitness function. The lower the value of the function f(λ), the better the fitness of the chromosome.
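As an illustration of the parent selection step only, Stochastic Universal Sampling could be sketched as follows. The example does not state how the function values are converted into selection weights; this sketch assumes a simple "worst minus value" weighting so that lower values are favoured. The function name and the sample values are hypothetical.

```python
import random


def stochastic_universal_sampling(fitness_values, n_select, rng=random):
    """Select n_select parent indices with equally spaced pointers.

    Lower fitness values are better, so each value is converted to a
    positive weight first (the weighting scheme is an assumption).
    """
    worst = max(fitness_values)
    weights = [worst - f + 1e-12 for f in fitness_values]
    total = sum(weights)
    step = total / n_select
    start = rng.uniform(0.0, step)   # single random offset for all pointers
    selected = []
    index = 0
    cumulative = weights[0]
    for k in range(n_select):
        pointer = start + k * step
        # Advance to the individual whose weight interval holds the pointer.
        while cumulative < pointer and index < len(weights) - 1:
            index += 1
            cumulative += weights[index]
        selected.append(index)
    return selected


# Hypothetical function values for four chromosomes (lower is better).
parents = stochastic_universal_sampling([8.0, 8.1, 50.0, 50.0], 4,
                                        rng=random.Random(0))
```

Because a single random offset places all the pointers, individuals with larger weights are selected in proportion to their weight with less sampling variance than repeated roulette-wheel draws.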

[0048] For comparison, three different standard mixtures of BTEX (Sample Nos. V1, V2, and V3 in Table 1) were employed and all methods were processed in triplicate. The band decomposition precision, defined as the average value of f(λ) for each method, is shown in Table 2. It is clear from Table 2 that the Genetic Algorithm provided the highest precision of all the methods, with an average value over the three samples of 8.0×10⁻³. Minimax was the second best candidate, with an average value similar to that of GA. However, comparing the calculation time consumed by GA and Minimax, Minimax used only one third of the time that GA required and finished with one thousand iterative calculations. Therefore, Minimax was selected from all the candidates to process the band separation in the example. Using the same spectrum data as demonstrated previously, the band decomposition result using the Minimax optimization method is shown in Figure 7. It can be seen from this figure that some of the small peaks were neglected by the band decomposition, since these peaks were eliminated from the smoothed SDC after the low pass filter was applied.

[0049] Back Propagation Neural Network (BPNN)

[0050] All of the training samples were processed using the same band decomposition procedures described above, including the low pass filter and baseline correction. The absorbance values of the sub-bands whose peaks were located roughly at wavenumbers of 673, 697, 728, 740, 768 and 795 cm⁻¹, within a variable range of ±2 cm⁻¹, were utilized as inputs of the BPNN after being divided by the IR intensity value. The architecture of the BPNN model for determination was 6×N×6: one input layer with six neurons (one peak value each); one hidden layer, whose number of neurons N was determined by optimization; and one output layer with six output neurons, corresponding to the six predicted concentrations of BTEX. To optimize the performance of the BPNN, the training parameters were set at a maximum of 300 epochs, with a fixed error goal for the training subset of 0.001 Root Mean Square Error (RMSE). The robustness and appropriateness of the approach were assessed using the mean of the Relative Error (MRE) of the testing set (e.g. 10 synthetic samples), between the predicted and the known concentrations. All hidden-layer neuron numbers from 2 to 20 were trained in parallel, and their performance was compared.
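The two error measures mentioned above might be computed as in the following sketch. The example does not give explicit formulas; the sketch assumes the relative error is the absolute difference divided by the known value, expressed as a percentage. The numbers in the usage lines are purely illustrative and are not measurements from the example.

```python
import math


def mean_relative_error(predicted, known):
    """Mean relative error (%) between predicted and known concentrations."""
    errors = [abs(p - k) / abs(k) for p, k in zip(predicted, known)]
    return 100.0 * sum(errors) / len(errors)


def root_mean_square_error(predicted, known):
    """RMSE, the quantity used for the training error goal."""
    squares = [(p - k) ** 2 for p, k in zip(predicted, known)]
    return math.sqrt(sum(squares) / len(squares))


# Candidate hidden-layer sizes compared in the example: 2 to 20 neurons.
hidden_sizes = list(range(2, 21))

# Illustrative values only (not data from the example).
mre = mean_relative_error([110.0, 90.0], [100.0, 100.0])
rmse = root_mean_square_error([110.0, 90.0], [100.0, 100.0])
```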

[0051] In the example, different configurations of the BPNN model were tried in MATLAB for comparison: three transfer functions (linear, tangent sigmoid and log sigmoid) and five training functions (Bayesian regularization back-propagation, conjugate gradient back-propagation, gradient descent back-propagation, Levenberg-Marquardt back-propagation and scaled conjugate gradient back-propagation). After optimization, the architecture of the BPNN model was set at 6×12×6 for simultaneous determination of the six BTEX components. The tangent sigmoid transfer function was used for the hidden layer. The linear transfer function was employed as the output function for the output layer. The weights and biases of the BPNN were randomly initialized before applying the Bayesian regularization back-propagation training function.

[0052] An additional 10 samples containing random combinations of BTEX components at different concentrations were then used for independent validation, as listed in Table 1. The analysis procedures described above, starting with baseline correction and working through to using the BPNN model to predict BTEX composition, were applied to the validation samples in triplicate, and the results are shown in Figure 8. It can be seen that the worst prediction, with an average MRE of 14%, was obtained for Toluene. With an average MRE of 13.8%, the prediction ability for p-Xylene was similar. However, this approach did provide a relatively high prediction capability for Benzene and m-Xylene (MRE of 11.6% and 12.1%, respectively). The MRE values for Ethylbenzene and o-Xylene were 12.9% and 13.2%, respectively. Overall, the means of the relative errors between the predicted results and the known values for the BTEX components were all under 15%.
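By way of illustration, the forward pass of a 6×12×6 network with a tangent sigmoid hidden layer and a linear output layer may be sketched as follows. Training (Bayesian regularization in the example) is not shown, the weights are merely randomly initialized, and the input values are hypothetical, so the outputs of this sketch carry no physical meaning. All names are assumptions.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Randomly initialized weights (n_out rows of n_in) and biases."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-0.5, 0.5) for _ in range(n_out)]
    return weights, biases


def forward(inputs, hidden_layer, output_layer):
    """One forward pass: tanh hidden layer followed by a linear output layer."""
    w_h, b_h = hidden_layer
    w_o, b_o = output_layer
    hidden = [math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
              for row, b in zip(w_h, b_h)]
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(w_o, b_o)]


rng = random.Random(0)
hidden_layer = init_layer(6, 12, rng)    # six inputs to twelve hidden neurons
output_layer = init_layer(12, 6, rng)    # twelve hidden neurons to six outputs

# Hypothetical normalized sub-band amplitudes for the six peak locations.
example_inputs = [0.10, 0.25, 0.05, 0.30, 0.15, 0.20]
outputs = forward(example_inputs, hidden_layer, output_layer)
```

The tangent sigmoid transfer function is mathematically equivalent to the hyperbolic tangent, which is why `math.tanh` is used for the hidden layer here.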

[0053] As a case study, four different petrol samples A, B, C, and D were utilized. These samples, including unleaded #91 and #98, were collected from petrol service stations in South Australia, Australia. 30 μL of each petrol sample was injected into the chamber 14 and left for three minutes to vaporize at room temperature (22°C). The band decomposition results for these four samples are shown in Figures 9 to 12. GC-MS was used to validate the prediction results from FTIR. The GC settings were as follows: helium was used as the carrier gas and set at 70 kPa, and 4-Bromofluorobenzene (25 μg/ml in methanol) was used as the tuning test standard for tuning verification. The column (30 m × 0.32 mm ID; Sol-Gel based polyethylene glycol stationary phase, 1 μm film thickness) was installed, and the initial and final temperatures were set at 40 °C (6 min) and 120 °C, respectively. The injector and interface temperatures were set at 200 °C and 230 °C, respectively. The MS settings were as follows: scan range (35-270 m/z at 500 m/z s), interval sampling rate (0.5 s), threshold (1000), ionization (electron impact at 70 eV), detector relative to tuning value (0) and solvent cut time (3 min). The BPNN prediction results compared to GC-MS are shown in Table 3. It is evident in Table 3 that the highest BTEX component in all petrol samples was Toluene. Except for sample C, Toluene concentrations were all above 1000 mg/m³ according to the BPNN prediction. Total Xylene (the sum of o-, m- and p-Xylene) and Ethylbenzene also existed in large amounts in the petrol samples.

[0054] According to the example, FTIR can be implemented for online in-situ monitoring of BTEX components using the apparatus 10 for automatically determining BTEX vapours based on the FTIR spectra. As above, the apparatus 10 and the method 100 are able to automatically inspect the infrared spectrum in the BTEX 'fingerprint' region and identify the components. The method includes the steps of baseline correction, noise filter processing, band decomposition (curve fitting) and BPNN prediction to determine the BTEX components in the sample. The baseline correction algorithm is object-oriented and is designed for spectrum regions with a low signal-to-noise ratio, avoiding misfitting of the peaks. After the Gaussian signal 'smooth' filtering technique, the 2nd derivative curve initialized the number and location of the sub-bands. The band decomposition (curve fitting) was subsequently performed using the mathematical optimization algorithm Minimax. Finally, BTEX compounds were determined through the Back Propagation Neural Network (BPNN), using the amplitudes of the identified sub-bands relative to the predefined locations of the BTEX components.

[0055] Referring back to Figure 2, the method of automatically determining volatile organic compounds (VOCs) in a sample includes the steps of transforming the detected infrared light to a Fourier Transform Infrared (FTIR) spectrum, processing the FTIR spectrum to identify sub-bands, performing baseline correction of the sub-bands, and processing the sub-bands using a neural network algorithm to determine each of the VOCs. In addition, it will be appreciated by those persons skilled in the art that further aspects of the method will be apparent from the above description of the apparatus 10 and the example. Further, the person skilled in the art will also appreciate that at least part of the method could be embodied in program code that is implemented by a processor of the apparatus 10. The program code could be supplied in a number of ways, for example on a tangible computer readable medium, such as a disc or a memory.

[0056] Those skilled in the art will also appreciate that the invention described herein is susceptible to variations and modifications other than those specifically described. It is to be understood that the invention includes all such variations and modifications.

[0057] The discussion of documents, acts, materials, devices, articles and the like is included in this specification solely for the purpose of providing context for the present invention. It is not suggested or represented that any of these matters formed part of the common general knowledge relevant to the present invention as it existed before the priority date of each claim of this application.

Sample  Benzene  Toluene  Ethylbenzene  o-Xylene  m-Xylene  p-Xylene
T1      100      300      200           500       500       1000
T2      25       750      1000          500       500       500
T3      250      100      200           500       1000      1000
T4      250      100      1000          500       200       500
T5      100      100      500           1000      1000      500
T6      25       300      200           1000      1000      500
T7      25       100      500           200       500       1000
T8      100      100      1000          1000      500       200
T9      25       300      1000          1000      200       1000
T10     250      750      500           1000      200       1000
T11     100      750      200           200       200       500
T12     100      300      500           500       200       200
T13     250      300      1000          200       1000      200
T14     250      750      200           1000      500       200
T15     25       100      200           200       200       200
T16     250      300      500           200       500       500
T17     25       750      500           500       1000      200
T18     100      750      1000          200       1000      1000
V1      25       125      200           200       200       200
V2      100      450      500           500       500       500
V3      250      750      1000          1000      900       1000
V4      200      700      800           500       900       500
V5      150      200      500           400       800       400
V6      50       300      200           250       200       200
V7      250      650      550           200       1000      300
V8      75       780      600           300       350       200
V9      25       150      900           200       200       550
V10     150      100      300           900       700       350

Table 1: Orthogonal design table for BTEX calibration. T1-T18 and V1-V10 represent the nominal compositions of the 18 training and 10 validation synthesized solutions (unit: mg/m³).

Average value of f(λ) (×10⁻³)

Method   Sample#1  Sample#9  Sample#18  Average
DFP      13.7      10.0      14.4       12.7
NM       12.5      9.3       10.6       10.8
LM       10.2      8.7       11.9       10.2
Minimax  8.3       7.5       8.6        8.1
Linear   9.9       8.5       9.0        9.1
GA       8.2       7.3       8.7        8.0

Table 2: Variation of accuracy with different optimization methods

DFP - Davidon Fletcher Powell method; NM - Nelder-Mead method; LM - Levenberg Marquardt method; Linear - Linear search method; GA - Genetic algorithm

        Benzene         Toluene         Ethylbenzene     o-Xylene
Sample  GC-MS   BPNN    GC-MS   BPNN    GC-MS   BPNN     GC-MS   BPNN
A       97.7    83.4    1503    1112    1039    1027.8   218.6   433.5
B       58.2    43.3    1224    1049    671.6   780.3    271.3   358.7
C       91.4    87.4    501.0   737.7   436.6   376.7    89.6    60.8*
D       30.7    16.4    1042    1186    1097.1  1035.6   240.6   358.8

        m-Xylene        p-Xylene        Total Xylene
Sample  GC-MS   BPNN    GC-MS   BPNN    GC-MS   BPNN
A       222.3   356.8   713.0   603.7   1260.5  1394
B       185.3   162.5*  650.5   399.9   1038.5  921.1
C       148.9   38.5*   323.9   191.5*  530.4   290.8
D       196.2   246.5   388.3   277     634.8   882.3

Table 3: Comparison of BTEX composition values from BPNN using FTIR data and GC-MS (unit: mg/m³)

* Under the prediction limits of BPNN
