Title:
METHODS AND APPARATUS FOR DETERMINING THE QUALITY OF SEALING OF PACKAGING
Document Type and Number:
WIPO Patent Application WO/2005/105580
Kind Code:
A2
Abstract:
A method for determining the quality of sealing of an article, in particular the quality of thermic sealing of packages comprising a heat-sealable film. The method comprises the steps of: collecting data on a sound wave generated by sealing apparatus during sealing; and performing a multivariate analysis on the collected data to determine the quality of sealing of the article. The data are collected by one or more microphones positioned adjacent to the sealing jaws of the apparatus. The invention also provides apparatus specifically adapted to carry out methods according to the invention.

Inventors:
BAUDAT GASTON (CH)
Application Number:
PCT/IB2005/001460
Publication Date:
November 10, 2005
Filing Date:
April 29, 2005
Assignee:
MARS INC (US)
BAUDAT GASTON (CH)
International Classes:
B65B57/00; G01M3/24; G01M7/02; G01N29/14; G01N29/44; (IPC1-7): B65B57/00; G01M3/24; G01M7/02; G01N29/14
Domestic Patent References:
WO1994022614A1    1994-10-13
WO2004099751A2    2004-11-18
Foreign References:
RU2123687C1    1998-12-20
US5992600A    1999-11-30
Other References:
PATENT ABSTRACTS OF JAPAN vol. 2000, no. 26, 1 July 2002 (2002-07-01) & JP 2001 240024 A (FUJI MACH CO LTD), 4 September 2001 (2001-09-04) cited in the application
Attorney, Agent or Firm:
James, Anthony Christopher W. P. (43-45 Bloomsbury Square, London WC1A 2RA, GB)
Claims:
CLAIMS
1. A method for determining the quality of sealing of an article, comprising the steps of: collecting data on a sound wave generated by sealing apparatus during sealing; and performing a multivariate analysis on the collected data to determine the quality of sealing of the article.
2. A method according to claim 1, wherein the step of collecting comprises collecting data for a sound wave at a plurality of locations, each location providing data for at least one variable used in the multivariate analysis.
3. A method according to claim 2, wherein the step of collecting comprises collecting data at a first location and collecting data at a second location, wherein the first location is located closer to the sealing apparatus than the second location.
4. A method according to claim 2 or claim 3, further comprising, after the step of collecting, the step of subtracting data collected at a first location from data collected at a second location to remove background noise before the step of performing a multivariate analysis.
5. A method according to any one of the preceding claims, wherein the step of performing a multivariate analysis comprises analysing the collected data with stored reference data representative of at least one level of quality of sealing.
6. A method according to any one of claims 1 to 5, wherein the step of performing a multivariate analysis comprises: transforming the collected data into a frequency domain; and determining the quality of sealing by consideration of the data in the frequency domain.
7. A method according to claim 6, wherein the step of transforming comprises performing a Fast Fourier Transform (FFT) on the collected data.
8. A method according to claim 6 or claim 7, wherein the step of determining comprises multiplying the data in the frequency domain with a Gaussian function.
9. A method according to claim 6 or claim 7, wherein the step of manipulating comprises multiplying the data in the frequency domain with a Hanning window.
10. A method according to any one of claims 6 to 9, wherein the step of determining further comprises performing an inverse Fast Fourier Transform (FFT) on the collected data.
11. A method according to claim 1, wherein the step of performing a multivariate analysis comprises: producing a k-dimensional feature vector from the collected data; and determining the quality of sealing by comparison of the feature vector with a plurality of target vectors, each target vector indicative of a quality of sealing.
12. A method according to claim 11, wherein the step of determining the quality of sealing comprises determining that target vector which is closest to the feature vector.
13. A method according to claim 12, further comprising the step of returning a given quality of sealing if components of the feature vector meet a predetermined criterion.
14. A method according to claim 13, wherein the components of the feature vector are determined to meet said predetermined criterion if the closest target vector has a predetermined relationship with the feature vector.
15. A method according to claim 14, wherein the predetermined criterion is different depending on which target vector is closest to the feature vector.
16. A method according to any one of claims 13 to 15, wherein the components of the feature vector are determined to meet said predetermined criterion if the difference between each individual component of target vector and the corresponding component of the feature vector is within a predetermined range.
17. A method according to any one of claims 12 to 16, wherein at least one neural network is used to perform the comparison of the feature vector with a plurality of target vectors.
18. A method according to claim 17, wherein, for every target vector, the distance to the feature vector is calculated using one neuron or neuron-like part.
19. A method according to any one of claims 11 to 18, wherein the step of producing the k-dimensional feature vector comprises deriving the feature vector using stored statistical data to map the collected data to a feature vector.
20. A method according to claim 19, further comprising the step of updating the statistical data representative of the target class on a manual identification of the quality of sealing of an article.
21. A method according to any one of claims 11 to 20, further comprising the step of updating the target vectors following a manual identification of the quality of sealing of an article.
22. A method according to any one of the preceding claims, wherein the sound wave is an ultrasound wave.
23. A method according to any one of the preceding claims, wherein the article is a package.
24. A method according to any one of the preceding claims, wherein the article is a package containing confectionery.
25. A method according to any one of the preceding claims, wherein the sealing apparatus is a thermal sealing device.
26. Apparatus for determining the quality of sealing of an article, comprising: a transducer for generating data on a sound wave generated by sealing apparatus during sealing; and a processing unit connected to the transducer and configured to perform a multivariate analysis on the generated data to determine the quality of sealing of the article.
27. Apparatus according to claim 26, further comprising a plurality of additional transducers at a plurality of locations for generating data on the sound wave, each transducer providing data for at least one variable processed in the processing unit.
28. Apparatus according to claim 26, wherein the plurality of transducers comprises a first transducer and a second transducer, wherein the first transducer is located closer to the sealing apparatus than the second transducer.
29. Apparatus according to claim 28, wherein the processing unit is further configured to subtract data generated by the second transducer from data generated by the first transducer to remove background noise.
30. Apparatus according to any one of claims 26 to 29, further comprising: a memory connected to the processing unit for storing reference data representative of at least one level of quality of sealing, wherein the processing unit is further configured to analyse the generated data in conjunction with the stored reference data.
31. Apparatus according to claim 30, wherein the processing unit is further configured to receive identification data corresponding to a manual identification of the quality of sealing of the article and process the collected data in association with the identification data to output the reference data for storing in the memory.
32. Apparatus according to any one of claims 26 to 31, wherein the processing unit is further configured to transform the generated data into a frequency domain and determine the quality of sealing by consideration of the data in the frequency domain.
33. Apparatus according to claim 32, wherein the processing unit is configured to perform a Fast Fourier Transform (FFT) on the generated data.
34. Apparatus according to claim 32 or claim 33, wherein the processing unit is configured to multiply the generated data in the frequency domain with a Gaussian function.
35. Apparatus according to claim 32 or claim 33, wherein the processing unit is configured to multiply the generated data in the frequency domain with a Hanning window.
36. Apparatus according to any one of claims 32 to 35, wherein the processing unit is configured to perform an inverse Fast Fourier Transform (FFT) on the generated data.
37. Apparatus according to claim 26, wherein the processing unit is configured to generate a k-dimensional feature vector from the collected data and determine the quality of sealing by comparing the feature vector with a plurality of target vectors, each target vector indicative of a quality of sealing.
38. Apparatus according to claim 37, wherein the processing unit is further configured to determine that target vector which is closest to the feature vector.
39. Apparatus according to claim 38, wherein the processing unit is further configured to output a value representing the quality of sealing if components of the feature vector meet a predetermined criterion.
40. Apparatus according to claim 39, wherein the processing unit is further configured to determine whether the components of the feature vector meet said predetermined criterion if the closest target vector has a predetermined relationship with the feature vector.
41. Apparatus according to claim 40, wherein the predetermined criterion is different depending on which target vector is closest to the feature vector.
42. Apparatus according to claim 41, wherein the components of the feature vector are determined to meet said predetermined criterion if the difference between each individual component of target vector and the corresponding component of the feature vector is within a predetermined range.
43. Apparatus according to any one of claims 37 to 42, wherein the processing unit forms at least one neural network to perform the comparison of the feature vector with a plurality of target vectors.
44. Apparatus according to any one of claims 37 to 43, wherein the processing unit comprises a plurality of subsidiary processing modules, wherein each subsidiary processing module forms a neural network to perform the comparison feature for each target vector in parallel.
45. Apparatus according to any one of claims 37 to 44, wherein the processing unit is configured to derive the feature vector using stored statistical data to map the collected data to a feature vector.
46. Apparatus according to any one of claims 26 to 45, wherein the sound wave is an ultrasound wave.
47. Apparatus according to any one of claims 26 to 46, wherein the article is a package.
48. Apparatus according to any one of claims 26 to 47, wherein the article is a package containing confectionery.
49. Apparatus according to any one of claims 26 to 48, wherein the sealing apparatus is a thermal sealing device.
50. A method for improving the detection capability of a processing unit for detecting the quality of sealing performed by sealing apparatus on an article, comprising the steps of: collecting data at a processing unit on a sound wave generated by sealing apparatus during sealing of an article; manually identifying the quality of sealing of the article to the processing unit; processing the collected data in association with the identified quality of sealing to produce reference data corresponding to the identified quality of sealing; and storing the reference data in memory connected to the processing unit for accessing by the processing unit during a determination of the quality of sealing of articles.
51. A method according to claim 50, wherein the step of identifying comprises identifying an acceptable or unacceptable quality of sealing and the step of processing comprises generating a first output of reference data representative of an acceptable quality of sealing and a second output of reference data representative of an unacceptable quality of sealing; and storing said first and second outputs.
52. A method according to claim 51, wherein the step of generating comprises comparing the collected data, identified quality of sealing and previously stored first and second outputs.
53. A method according to claim 52, wherein the step of processing is performed separately for the first stored output and second stored output.
54. A method according to claim 53 wherein each separate step of processing is carried out on a separate neural network.
55. A method according to claim 53 or claim 54, wherein each separate processing step is performed in parallel.
56. A method according to claim 53, wherein each separate processing step is performed in series.
57. A method according to any one of claims 51 to 56, wherein the processing step is adjusted to compensate for changes in the first and second stored outputs.
58. A method according to any one of claims 50 to 57, wherein the pressure wave is an ultrasonic wave.
59. A method according to any one of claims 50 to 58, wherein the article is a package.
60. A method according to claim 59, wherein the article is a package containing confectionery.
61. A method according to any one of claims 50 to 60, wherein the sealing apparatus is a thermal sealing device.
62. A method for determining the quality of sealing of an article substantially as hereinbefore described with reference to the accompanying drawings.
63. Apparatus substantially as hereinbefore described with reference to the accompanying drawings.
64. A method for improving the detection capability of a processing unit substantially as hereinbefore described with reference to the accompanying drawings.
Description:
PACKAGING METHODS AND APPARATUS

The present invention relates to methods and apparatus for detecting defective seals in packages formed by automated packaging equipment. The invention relates in particular to the detection of defective heat seals formed by such equipment.

A wide range of packaging equipment relies on forming seals in flexible, heat-sealable films to close the package after filling. For example, one common packaging technique is form-fill-seal packaging, including vertical form-fill-seal (VFFS) packaging. In VFFS packaging, a flat web of flexible, heat-sealable film is unwound from a roll and formed into a continuous tube in a tube-forming section, by sealing the longitudinal edges on the film together to form a so-called lap seal or a so-called fin seal. The tube thus formed is pulled vertically downwards to a filling station. The tube is then collapsed across a transverse cross-section of the tube, the position of such cross-section being at a sealing device below the filling station. A transverse heat seal is made, by the sealing device, at the collapsed portion of the tube, thus making an air-tight seal across the tube. The material being packaged enters the tube above the transverse heat seal in a continuous or intermittent manner, thereby filling the tube upwardly from the transverse heat seal. The tube is then allowed to drop a predetermined distance, usually under the influence of the weight of the material in the tube. The jaws of the sealing device are closed again, thus collapsing the tube at a second transverse section, which may be at, above or below the air/material interface in the tube, depending on the nature of the material being packaged and the mode of operation of the process. The sealing device seals and severs the tube transversely at the second transverse section. The material-filled portion of the tube is now in the form of a pillow-shaped pouch. Thus, the sealing device has sealed the top of the filled pouch, sealed the bottom of the next-to-be-formed pouch and separated the filled pouch from the next-to-be-formed pouch, all in one operation. Variations on pouch-forming machines and in particular on this type of vertical form-fill-seal apparatus are either known or conceivable.
For example, the forming and sealing functions may be performed separately from severing function on separate machines. Also, the jaws of the sealing device could move to the next sealing position rather than have the film drop to the next position or there could be two sets of sealing jaws that seal both transverse ends simultaneously. Further, instead of forming a tube, two pieces of film could be fed into the machine and the pouch could be made by four seals, two longitudinal and two transverse. It will also be appreciated that form-fill-seal equipment can be operated in non-vertical mode, for example in horizontal mode, especially for packaging unitary items, for example as shown in JP-A-2001240024.

In addition to the form-fill-seal equipment described above, automated packaging equipment in widespread use includes equipment for filling and sealing stand-up pouches. That is to say, pouches formed from flexible, heat-sealable film having front and back panels and a bottom panel forming a gusset to allow the pouch to stand up after filling. The pouches are individually filled, and then sealed along the top edges of the front and back panels.

Other automated packaging equipment is used to seal a flexible, heat-sealable film cover over a thermoformed tray containing the product to be packaged.

The most commonly used sealing device used to form the various seals described above is a thermic sealing device. In such a device, the thermoplastic films to be sealed together are clamped between electrically preheated jaws to press and melt together at least a bonding layer of the thermoplastic films to be sealed along a sealing strip defined by the dimensions of the heated portion of the jaws. The jaws are then opened, and a knife or other cutting device is used to separate successive pouches by cutting along the median line of the sealed strip.

Another sealing device commonly used is a so-called impulse sealer which has a sealing element mounted in sealing jaws and electrically insulated therefrom. In operation, the sealing jaws are closed and an electrical current is caused to flow through a sealing element, e.g. a wire, for a fraction of the time that the jaws are closed. The jaws remain closed during a cooling or equilibration period, during which the seal forms, before the sealing jaws are opened. Impulse sealing is especially suitable for through-liquid sealing, for example in the sealing of milk pouches in the dairy industry. It is important that the cooling of the seal takes effect before the weight of the liquid can weaken or rupture the bottom seal.

The objective of substantially all packaging processes is to provide uniform seals of predetermined strength that are continuous, and air- and liquid-impermeable. Such reliability is especially important in food packaging applications in order to avoid microbial contamination of the packaged foodstuffs. Unfortunately, no sealing method is 100% reliable. Thermic sealing especially can produce defective seals, in particular when particles of the material being packaged are trapped between the sealing jaws of the apparatus. A method is needed to detect imperfect seals produced by automated packaging equipment, so that the defective packages can automatically be removed from the production line for further checking and/or rejection.

At present there is no satisfactory method for detecting imperfect seals. Most production lines rely on visual inspection of the packages, optionally supplemented by taking sample packages from the line for a more detailed inspection. The more detailed inspection may include compressing the packages with a predetermined pressure and looking for leakage or collapse of the package. The pressure may be pulsed, and/or it may be applied while the package is immersed in liquid so that small leaks are detectable by the formation of bubbles. It will readily be appreciated that these methods are cumbersome, and that any method based on statistical sampling will fail to find all defective packages.

JP-A-2001240024 describes a method of detecting faulty seals that are formed in form-fill-seal packaging of rice crackers due to trapping of a rice cracker between the jaws of the sealing apparatus. The Japanese patent describes using a digital filter to analyse the intensity of the sealing sound in a predetermined frequency band, preferably centered on a frequency in the range 12-20 kHz. The measured intensity in this band is compared with a reference value for normal sealing. The reference value is determined from the sealing sound measured for a number of normal seals. The apparatus detects the loud sound that is produced when the machine tries to seal across a rice cracker. This technology has not been used commercially, and indeed the great majority of sealing processes do not produce an operator-audible variation in the sealing sound between good and bad seals.

It is an object of the present invention to provide a method and apparatus for detecting defective seals formed during thermal sealing in automated packaging equipment. It is a further object of the invention to provide such a method and apparatus for detecting defective seals that is applicable to a wide range of different packaging methods and products.

It is a further object of the invention to provide such a method and apparatus for detecting defective seals that is capable of distinguishing different failure modes of the packaging operation.

It is a further object of the invention to provide such a method and apparatus for detecting defective seals that is reliable even in the presence of background noise.

It is a further object of the invention to provide such a method and apparatus for detecting defective seals that is effective even when defective sealing results in only very small variations in the sealing noise profile.

In a first aspect, the present invention provides a method for determining the quality of sealing of an article, comprising the steps of: collecting data on a sound wave generated by sealing apparatus during sealing; and performing a multivariate analysis on the collected data to determine the quality of sealing of the article.

The term "multivariate analysis" means that more than one property of the sound produced by the sealing operation is analysed by the method. Suitable properties for such analysis include the intensity or the energy density of the sound, which may be measured in one or more frequency bands, at one or more locations relative to the sealing jaws, and over one or more predetermined periods relative to the timing cycle of the sealing mechanism. A plurality of such parameters, or linear combinations thereof, are subjected to multivariate analysis to identify defective sealing. Preferably more than two variables are analysed, more preferably more than three variables, for example at least 5, 10, 20 or 50 or more variables may be analysed.
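The notion of several variables per sealing operation can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that each variable is the mean squared amplitude of one microphone channel over one time window of the sealing cycle; the function names, the window boundaries and the sample values are hypothetical and are not taken from the specification.

```python
from typing import List, Sequence, Tuple


def window_energy(samples: Sequence[float], start: int, stop: int) -> float:
    """Mean squared amplitude of samples[start:stop], a simple energy measure."""
    segment = samples[start:stop]
    if not segment:
        raise ValueError("empty window")
    return sum(x * x for x in segment) / len(segment)


def build_variables(channels: Sequence[Sequence[float]],
                    windows: Sequence[Tuple[int, int]]) -> List[float]:
    """Return one variable per (microphone channel, time window) pair."""
    return [window_energy(ch, a, b) for ch in channels for (a, b) in windows]


# Hypothetical recordings from two microphone positions (illustrative values).
near = [0.0, 1.0, -1.0, 2.0, -2.0, 0.0]
far = [0.0, 0.5, -0.5, 0.5, -0.5, 0.0]

# Two windows per channel give four variables for the multivariate analysis.
variables = build_variables([near, far], [(0, 3), (3, 6)])
print(variables)
```

Energies in several frequency bands could be appended to the same list in the same way, so that the number of variables grows with the number of bands, locations and periods considered.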

Typically, the step of performing a multivariate analysis comprises analysing the collected data with stored reference data representative of at least one level of quality of sealing. Suitably, a plurality of variables obtained from the sealing sound wave are compared with a corresponding plurality of reference variables in order to diagnose at least one property of the sealing operation.

The plurality of reference variables are usually derived by analyzing the sound waves of a plurality of "normal" sealing operations, that is to say sealing operations that can be determined to be satisfactory by subsequent physical inspection of the seals. In preferred embodiments, the methods according to the present invention further comprise the feature of continuously updating the reference variables using data from normal sealing operations during the operation of the sealing system in order to allow for instrumental drift and/or changes in background noise.
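One possible way to keep reference variables up to date is sketched below in Python. The specification does not state an update rule, so the exponential moving average, the class name `RunningReference` and the smoothing factor `alpha` are assumptions made only for illustration.

```python
from typing import List, Sequence


class RunningReference:
    """Reference variables tracked with an exponential moving average (assumed rule)."""

    def __init__(self, initial: Sequence[float], alpha: float = 0.05) -> None:
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must lie in (0, 1]")
        self.mean: List[float] = list(initial)
        self.alpha = alpha

    def update(self, variables: Sequence[float]) -> List[float]:
        """Blend the variables of one normal seal into the stored reference."""
        a = self.alpha
        self.mean = [(1.0 - a) * m + a * v for m, v in zip(self.mean, variables)]
        return self.mean


# Call update() only for seals judged normal, so drift is followed
# without defective seals contaminating the reference.
reference = RunningReference([1.0, 2.0], alpha=0.5)
updated = reference.update([3.0, 2.0])
print(updated)
```

A small `alpha` makes the reference follow slow drift while remaining insensitive to any single seal.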

The plurality of measured variables may, for example, be derived by collecting data for the sound wave at a plurality of locations, each location providing data for at least one variable used in the multivariate analysis.

The sound wave data are normally collected by means of a microphone. The methods according to the present invention preferably comprise steps of pre-processing and conditioning the sound signal. These steps may comprise filtering and gain control, for example as described in "Adaptive IIR Filtering in Signal Processing and Control" by Phillip A. Regalia (Marcel Dekker, 1995), the entire content of which is incorporated herein by reference.

In certain embodiments according to this aspect, the method may comprise collecting data from a plurality of differently located microphones. For example, the step of collecting may comprise collecting data at a first location and collecting data at a second location, wherein the first location is located closer to the sealing apparatus than the second location. These embodiments may comprise, after the step of collecting, the step of subtracting data collected at a first location from data collected at a second location to remove background noise (adaptive noise cancellation). Further multivariate analysis can then be carried out on the two signals to provide a more accurate analysis of the sealing properties than could be achieved with a single microphone. The multivariate analysis may comprise analysis of more than one variable of the sound wave measured at each location. For example, in certain embodiments, the step of performing a multivariate analysis comprises transforming the collected data into a frequency domain followed by determining the quality of sealing by consideration of the data in the frequency domain. The step of transforming suitably comprises performing a Fast Fourier Transform (FFT) on the collected data. The step of determining the quality of sealing by consideration of the data in the frequency domain may for example comprise multiplying the data in the frequency domain with a Gaussian function, or it may comprise multiplying the data in the frequency domain with a Hanning window. In other embodiments, the step of determining the quality of sealing by consideration of the data in the frequency domain may for example comprise performing an inverse Fast Fourier Transform (FFT) on the collected data.
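The two-microphone subtraction and the frequency-domain weighting can be sketched together in Python. This is a simplified illustration under stated assumptions: a fixed-gain subtraction stands in for adaptive noise cancellation, a naive discrete Fourier transform stands in for an FFT so that no external library is needed, and the function names, the Gaussian centre and width, and the synthetic signals are all hypothetical.

```python
import cmath
import math
from typing import List, Sequence


def subtract_background(near: Sequence[float], far: Sequence[float],
                        gain: float = 1.0) -> List[float]:
    """Remove the background picked up by the far microphone (fixed gain assumed)."""
    return [n - gain * f for n, f in zip(near, far)]


def dft_magnitudes(samples: Sequence[float]) -> List[float]:
    """Magnitudes of the non-negative frequency bins (naive O(N^2) DFT)."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(samples)))
        for k in range(n // 2 + 1)
    ]


def gaussian_weights(num_bins: int, centre: float, width: float) -> List[float]:
    """Gaussian function over bin index, used to emphasise one frequency region."""
    return [math.exp(-0.5 * ((k - centre) / width) ** 2) for k in range(num_bins)]


def weighted_spectrum(near: Sequence[float], far: Sequence[float],
                      centre: float, width: float) -> List[float]:
    """Subtract background, transform, then multiply by the Gaussian function."""
    magnitudes = dft_magnitudes(subtract_background(near, far))
    weights = gaussian_weights(len(magnitudes), centre, width)
    return [m * w for m, w in zip(magnitudes, weights)]


# Synthetic example: a sealing tone in bin 3 plus a hum in bin 1 heard by both microphones.
N = 16
hum = [0.5 * math.sin(2 * math.pi * 1 * t / N) for t in range(N)]
seal = [math.sin(2 * math.pi * 3 * t / N) for t in range(N)]
near = [s + h for s, h in zip(seal, hum)]
far = hum
spectrum = weighted_spectrum(near, far, centre=3.0, width=1.0)
print([round(v, 3) for v in spectrum])
```

Weights following a Hanning window could be substituted for `gaussian_weights` without changing the rest of the sketch.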

In the methods according to the present invention, the step of performing a multivariate analysis may comprise: producing a k-dimensional feature vector from the collected data; and determining the quality of sealing by comparison of the feature vector with a plurality of target vectors, each target vector indicative of a quality of sealing. Suitably, the step of determining the quality of sealing comprises determining that target vector which is closest to the feature vector. Methods according to these embodiments suitably further comprise the step of returning a given quality of sealing if components of the feature vector meet a predetermined criterion. For example, the components of the feature vector may be determined to meet said predetermined criterion if the closest target vector has a predetermined relationship with the feature vector. In such methods, the predetermined criterion may be different depending on which target vector is closest to the feature vector. Suitably, the components of the feature vector may be determined to meet said predetermined criterion if the difference between each individual component of target vector and the corresponding component of the feature vector is within a predetermined range.
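The comparison described above can be sketched in Python as a nearest-target search followed by a per-component range check. Euclidean distance, the class labels, the target values and the per-class tolerances are assumptions chosen for illustration; the specification does not fix any of them.

```python
import math
from typing import Dict, Optional, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(feature: Sequence[float],
             targets: Dict[str, Sequence[float]],
             tolerances: Dict[str, float]) -> Optional[str]:
    """Return the label of the closest target vector, or None if the
    feature vector falls outside that target's per-component range."""
    label = min(targets, key=lambda name: euclidean(feature, targets[name]))
    tolerance = tolerances[label]  # the criterion depends on the closest target
    within = all(abs(f - t) <= tolerance
                 for f, t in zip(feature, targets[label]))
    return label if within else None


# Hypothetical target vectors and tolerances for two sealing qualities.
targets = {"good": [1.0, 1.0], "contaminated": [4.0, 0.5]}
tolerances = {"good": 0.5, "contaminated": 1.0}

print(classify([1.2, 0.9], targets, tolerances))
print(classify([2.4, 0.8], targets, tolerances))
print(classify([3.5, 0.6], targets, tolerances))
```

Returning `None` for a feature vector that is closest to a target but outside its range lets a caller treat unfamiliar sounds as unclassified rather than forcing them into a known class.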

The sound wave variables may be analysed using a parametric approach and/or a non-parametric approach. A description of such methods may be found in the following references, the contents of which are incorporated herein by reference: Non-parametric approaches to analysis of the sound wave parameters could include Linear Discriminant Analysis (LDA), its non-linear version the Generalised Discriminant Analysis (GDA), neural networks, Support Vector Machines (SVM), and genetic programming.

In some of these embodiments, at least one neural network is used to perform the comparison of the feature vector with a plurality of target vectors. Preferably, for every target vector, the distance to the feature vector is calculated using one neuron or neuron-like part.
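One way a single neuron per target vector could yield a distance is sketched below in Python. It relies on the identity ||x - w||^2 = ||x||^2 - 2 w.x + ||w||^2: a linear neuron with weights -2w and bias ||w||^2 outputs the squared distance minus ||x||^2, a term common to all neurons that does not affect which one is smallest. The class name `DistanceNeuron` and the target values are hypothetical, and the specification does not say that its neurons are built this way.

```python
from typing import Dict, Sequence


class DistanceNeuron:
    """Linear neuron whose output ranks targets by squared Euclidean distance."""

    def __init__(self, target: Sequence[float]) -> None:
        self.weights = [-2.0 * t for t in target]  # -2w
        self.bias = sum(t * t for t in target)     # ||w||^2

    def activate(self, feature: Sequence[float]) -> float:
        """Return ||x - w||^2 - ||x||^2 for feature vector x."""
        return sum(w * x for w, x in zip(self.weights, feature)) + self.bias


def closest(neurons: Dict[str, DistanceNeuron], feature: Sequence[float]) -> str:
    """Label of the neuron with the smallest activation, i.e. the nearest target."""
    return min(neurons, key=lambda name: neurons[name].activate(feature))


# One neuron per hypothetical target vector.
neurons = {
    "good": DistanceNeuron([1.0, 1.0]),
    "contaminated": DistanceNeuron([4.0, 0.5]),
}
print(closest(neurons, [1.2, 0.9]))
```

Because each neuron depends only on its own target vector, the activations can be evaluated independently of one another.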

Preferably, the step of producing the k-dimensional feature vector comprises deriving the feature vector using stored statistical data to map the collected data to a feature vector. Preferably, the methods further comprise the step of updating the statistical data representative of the target class on a manual identification of the quality of sealing of an article.
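A minimal Python sketch of mapping collected data to a k-dimensional feature vector follows. It assumes that the stored statistical data consist of a per-variable mean and standard deviation plus a k-row projection matrix (such as one obtained from a discriminant analysis); this form, the function name `to_feature_vector` and all numeric values are illustrative assumptions rather than details from the specification.

```python
from typing import List, Sequence


def to_feature_vector(variables: Sequence[float],
                      mean: Sequence[float],
                      std: Sequence[float],
                      projection: Sequence[Sequence[float]]) -> List[float]:
    """Standardise the measured variables, then project them onto k axes."""
    z = [(v - m) / s for v, m, s in zip(variables, mean, std)]
    return [sum(p * zi for p, zi in zip(row, z)) for row in projection]


# Hypothetical stored statistics for three measured variables and k = 2.
stored_mean = [1.0, 5.0, 0.0]
stored_std = [2.0, 1.0, 0.5]
stored_projection = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 1.0],
]

feature = to_feature_vector([3.0, 5.0, 1.0], stored_mean, stored_std,
                            stored_projection)
print(feature)
```

Updating the stored mean, standard deviation or projection after a manual identification changes how later measurements are mapped, without changing this mapping step itself.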

The methods may further comprise the step of updating the target vectors following a manual identification of the quality of sealing of an article.

The measured sound wave may comprise components in the range from about 1 Hz to about 100 kHz, for example from about 10 Hz to about 50 kHz. The analysis of the sound wave is normally performed on-line.

Preferably, the article comprises at least one layer of heat-sealable film. For example, the article may be a pouch such as a stand-up pouch or a form-fill-seal pouch formed from heat sealable film. It is envisaged that the present invention will be especially useful in the food and beverage packaging industries, wherein the article is a package containing a foodstuff or a beverage.

In a second aspect, the present invention provides an apparatus for determining the quality of sealing of an article, comprising: a transducer for generating data on a sound wave generated by sealing apparatus during sealing; and a processing unit connected to the transducer and configured to perform a multivariate analysis on the generated data to determine the quality of sealing of the article. The transducer is typically a microphone.

The apparatus according to this aspect of the invention may further comprise a plurality of additional transducers at a plurality of locations for generating data on the sound wave, each transducer providing data for at least one variable processed in the processing unit. For example, the plurality of transducers may comprise a first transducer and a second transducer, wherein the first transducer is located closer to the sealing apparatus than the second transducer. In these embodiments, the processing unit may be further configured to subtract data generated by the second transducer from data generated by the first transducer to remove background noise.

Preferably, the apparatus according to this aspect of the invention further comprises a memory connected to the processing unit for storing reference data representative of at least one level of quality of sealing, wherein the processing unit is further configured to analyse the generated data in conjunction with the stored reference data. In these embodiments, the processing unit is preferably further configured to receive identification data corresponding to a manual identification of the quality of sealing of the article and process the collected data in association with the identification data to output the reference data for storing in the memory.

The processing unit may be further configured to transform the generated data into a frequency domain and determine the quality of sealing by consideration of the data in the frequency domain. For example, the processing unit may be configured to perform a Fast Fourier Transform (FFT) on the generated data. The processing unit may be configured to multiply the generated data in the frequency domain with a Gaussian function or with a Hanning window. Alternatively or additionally, the processing unit may be configured to perform an inverse Fast Fourier Transform (FFT) on the generated data.

The processing unit may be configured to generate a k-dimensional feature vector from the collected data and determine the quality of sealing by comparing the feature vector with a plurality of target vectors, each target vector being indicative of a quality of sealing. For example, the processing unit may be further configured to determine that target vector which is closest to the feature vector. In these embodiments, the processing unit may be further configured to output a value representing the quality of sealing if components of the feature vector meet a predetermined criterion. For example, the processing unit may be further configured to determine whether the components of the feature vector meet said predetermined criterion if the closest target vector has a predetermined relationship with the feature vector.

In these embodiments, the predetermined criterion is preferably different depending on which target vector is closest to the feature vector. For example, the components of the feature vector may be determined to meet said predetermined criterion if the difference between each individual component of the target vector and the corresponding component of the feature vector is within a predetermined range.
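The comparison described above can be illustrated with a minimal Python sketch. It assumes a Euclidean distance for "closest" and a single symmetric tolerance per target vector; the names `closest_target`, `classify`, `targets` and `tolerances`, and all numeric values, are hypothetical and not taken from the specification.

```python
import math

def closest_target(feature, targets):
    """Return (label, distance) of the target vector nearest to `feature` (Euclidean)."""
    best_label, best_dist = None, math.inf
    for label, target in targets.items():
        dist = math.dist(feature, target)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label, best_dist

def classify(feature, targets, tolerances):
    """Accept the closest class only if every component lies within that class's tolerance."""
    label, _ = closest_target(feature, targets)
    target, tol = targets[label], tolerances[label]
    within = all(abs(f - t) <= tol for f, t in zip(feature, target))
    return label if within else None

# Hypothetical 3-component target vectors and per-class tolerances
targets = {"good": [1.0, 0.2, 0.1], "bad": [0.4, 0.6, 0.5]}
tolerances = {"good": 0.15, "bad": 0.30}

result_ok = classify([0.95, 0.25, 0.12], targets, tolerances)
result_reject = classify([0.75, 0.20, 0.10], targets, tolerances)
```

In this sketch a feature vector whose closest target is "good" but which lies outside the tolerance of that target yields no classification, which mirrors the idea of a criterion that depends on the closest target vector.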

In certain embodiments according to the second aspect of the invention, the processing unit forms at least one neural network to perform the comparison of the feature vector with a plurality of target vectors. For example, the processing unit may comprise a plurality of subsidiary processing modules, wherein each subsidiary processing module forms a neural network to perform the comparison feature for each target vector in parallel.

Suitably, the processing unit is configured to derive the feature vector using stored statistical data to map the collected data to a feature vector.

In a third aspect, the present invention provides a method for improving the detection capability of a processing unit for detecting the quality of sealing performed by sealing apparatus on an article, comprising the steps of: collecting data at a processing unit on a sound wave generated by sealing apparatus during sealing of an article; manually identifying the quality of sealing of the article to the processing unit; processing the collected data in association with the identified quality of sealing to produce reference data corresponding to the identified quality of sealing; and storing the reference data in memory connected to the processing unit for accessing by the processing unit during a determination of the quality of sealing of articles. In accordance with this embodiment, the step of identifying suitably comprises identifying an acceptable or unacceptable quality of sealing and the step of processing comprises generating a first output of reference data representative of an acceptable quality of sealing and a second output of reference data representative of an unacceptable quality of sealing; and storing said first and second outputs.

In accordance with the third aspect of the invention, the step of generating preferably comprises comparing the collected data, identified quality of sealing and previously stored first and second outputs. Preferably, the step of processing is performed separately for the first stored output and second stored output. For example, each separate step of processing may be carried out on a separate neural network. Each separate processing step may be performed in parallel or in series. In certain embodiments, the processing step is adjusted to compensate for changes in the first and second stored outputs.

Techniques of multivariate analysis applied to banknote validation systems are described in the following patent applications by the present inventor: WO95/15540, US-A-5522491, WO94/12951, WO00/33262 and WO2004/001685. It will be appreciated that these methods may with suitable adaptation be applied to the present multivariate analysis of sound waves from packaging systems, and accordingly the whole contents of these applications are expressly incorporated herein by reference.

It is expected that the present invention will allow much wider application of sound analysis to the detection of seal quality in automated packaging equipment. In particular, the present invention will solve the hitherto intractable problem of detecting poor seals produced by thermal sealing equipment, such as FFS equipment. The present invention will enable the detection and analysis of sound variations well below the threshold of what is detectable to the ear, even in the presence of background noise and system drift. The multivariate analysis also opens the door to detection, diagnosis, and/or prediction of more than one failure mode, thereby improving process efficiency and reliability.

Certain embodiments of the invention will now be described in more detail, with reference to the accompanying drawings, in which: Figure 1 shows a schematic representation of a digital filter; Figure 2 shows the amplitude versus frequency response for a second order Butterworth digital filter; Figure 3 shows a schematic representation of an adaptive noise cancellation system; Figure 4 shows loci for different Mahalanobis distances; Figure 5 shows a general schematic for a neural network; Figure 6 shows an example of a neural network (BPNN) with three layers: four input neurons, three hidden neurons and three output neurons.

Sealing processes are performed in large machines and in a noisy environment. Therefore the first step is to pre-process the signal(s) from the microphone(s) for enhancing as much as possible the information and rejecting the noise, leading toward a high signal to noise ratio (SNR). Yet often such noise has a strong quasi-periodic structure or at least a narrow band spectrum, which can be handled with adaptive noise cancellation techniques. While one microphone at the right location can be enough, an array of them would be a solution for extracting signal and noise from remote locations using interferometry.

Filtering

An obvious way of improving the SNR is to use filters that extract only the useful part of the signal spectrum. In the present embodiments we shall use the convenient Fourier and Z transforms to describe the preferred digital filters. Of course it will be understood that a necessary hardware anti-aliasing filter and amplifiers are used prior to sampling to condition the signal and satisfy the Nyquist criterion as far as possible.

A digital filter is a simple convolution operation * in the time domain and a product in the frequency and Z domains. Equations (1) and (2) describe a general filter in the time and Z domains:

(1) $y[n] = h[n] * x[n] = \sum_{k} x[k]\, h[n-k]$ Time domain > convolution

(2) Y(z) = H(z)X(z) Z transform > product

Where x[n] is the input signal to be filtered and y[n] the output filtered signal. The integers n and k are sample indexes. A filter of this type is shown schematically in Fig. 1. The signal can be real or complex, yet here we will assume real signals if not otherwise noted. h[n] is named the filter impulse response and, in the Z/frequency domain, H(z), H(f) the transfer function. They describe the overall behavior of the filter.

The most general linear type of filter is the IIR (Infinite Impulse Response) filter, where H(z) can be expressed as a ratio of two polynomials (3). The resulting series h[n] has an infinite number of terms.

(3) $H(z) = \dfrac{\sum_{k=0}^{q} b[k]\, z^{-k}}{1 + \sum_{k=1}^{p} a[k]\, z^{-k}}$

In the time domain we have the linear constant coefficient difference equation (4):

(4) $y[n] = \sum_{k=0}^{q} b[k]\, x[n-k] - \sum_{k=1}^{p} a[k]\, y[n-k]$

When p = 0 the corresponding impulse response is limited to q + 1 terms and it is not recursive any more. Such filters are named FIR (Finite Impulse Response).
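As an illustration of the difference equation (4), the following Python sketch applies the recursion directly. It assumes zero initial conditions (samples before n = 0 are taken as zero); the function name `iir_filter` and the test coefficients are hypothetical.

```python
def iir_filter(b, a, x):
    """Apply y[n] = sum_{k=0}^{q} b[k] x[n-k] - sum_{k=1}^{p} a[k] y[n-k].

    `b` holds b[0..q]; `a` holds a[1..p] (pass an empty list for an FIR filter).
    Samples before n = 0 are taken as zero (an assumption of this sketch).
    """
    y = []
    for n in range(len(x)):
        acc = sum(b[k] * x[n - k] for k in range(len(b)) if n - k >= 0)
        acc -= sum(a[k - 1] * y[n - k] for k in range(1, len(a) + 1) if n - k >= 0)
        y.append(acc)
    return y

impulse = [1.0, 0.0, 0.0, 0.0, 0.0]
fir_response = iir_filter([0.5, 0.5], [], impulse)   # p = 0: response ends after q + 1 terms
iir_response = iir_filter([1.0], [-0.5], impulse)    # y[n] = x[n] + 0.5 y[n-1]: never ends
```

With p = 0 the impulse response stops after q + 1 = 2 terms, whereas the recursive filter keeps producing non-zero output.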

The following example is the design of a 2nd order IIR low pass (LP) filter using the bilinear Z transform (BZT) technique and a Butterworth analog prototype filter (ωc = 1). Its analog transfer function is described by the Laplace transform (5). Using the BZT substitution (6) we retrieve the digital filter (7). Fig. 2 shows its amplitude versus frequency response for a sampling rate of about 893 Hz (sampling period 560 μs) and a cut off frequency of 50 Hz in this example.

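(5) $H(s) = \dfrac{1}{s^2 + \sqrt{2}\, s + 1}$ Analog LP prototype filter

(6) $s = \dfrac{1}{\tan(\pi f_c / f_s)} \cdot \dfrac{1 - z^{-1}}{1 + z^{-1}}$ BZT substitution

With: $f_c = 50$ Hz, $f_s \approx 893$ Hz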

(7) $H(z) = \dfrac{1}{40.5999} \cdot \dfrac{1 + 2 z^{-1} + z^{-2}}{1 - 1.5096\, z^{-1} + 0.6081\, z^{-2}}$
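The coefficients in (7) can be checked numerically. The sketch below assumes the standard second-order Butterworth prototype $H(s) = 1/(s^2 + \sqrt{2}s + 1)$ and a pre-warped bilinear substitution with $K = \tan(\pi f_c / f_s)$; the function name `butterworth2_lowpass` is hypothetical, and the sampling rate is taken as the "about 893 Hz" stated in the text.

```python
import math

def butterworth2_lowpass(fc, fs):
    """Second-order Butterworth low-pass via the bilinear Z transform.

    Substitutes s = (1/K) * (1 - z^-1) / (1 + z^-1), with K = tan(pi*fc/fs),
    into the prototype H(s) = 1 / (s^2 + sqrt(2)*s + 1).
    Returns (gain, b, a) for
    H(z) = gain * (b0 + b1 z^-1 + b2 z^-2) / (1 + a1 z^-1 + a2 z^-2).
    """
    K = math.tan(math.pi * fc / fs)
    d = 1.0 + math.sqrt(2.0) * K + K * K
    gain = K * K / d
    b = [1.0, 2.0, 1.0]
    a = [2.0 * (K * K - 1.0) / d, (1.0 - math.sqrt(2.0) * K + K * K) / d]
    return gain, b, a

gain, b, a = butterworth2_lowpass(fc=50.0, fs=893.0)
print(round(1.0 / gain, 2), round(a[0], 4), round(a[1], 4))
```

Under these assumptions the computed denominator coefficients and gain agree with (7) to within the rounding implied by "about 893 Hz".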

When the size of the convolution (4) is large (typically 25 terms or more) it would be faster to work in the frequency domain using the relation (8). That can be implemented very efficiently using the fast Fourier transform algorithm (FFT) and its inverse (IFFT).

(8) Y(f) = H(f)X(f)
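The equivalence between the time-domain convolution and the frequency-domain product (8) can be sketched as follows. For brevity this example uses a naive O(N²) discrete Fourier transform in place of an FFT, and zero-pads both sequences to the full output length so that the product corresponds to a linear (not circular) convolution; all function names are hypothetical.

```python
import cmath

def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (an FFT gives the same values faster)."""
    N = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[n] * cmath.exp(sign * 2j * cmath.pi * k * n / N) for n in range(N))
           for k in range(N)]
    return [v / N for v in out] if inverse else out

def convolve(h, x):
    """Direct time-domain convolution y[n] = sum_k h[k] x[n-k]."""
    y = [0.0] * (len(h) + len(x) - 1)
    for i, hv in enumerate(h):
        for j, xv in enumerate(x):
            y[i + j] += hv * xv
    return y

def convolve_via_dft(h, x):
    """Zero-pad to the full output length, multiply the spectra, transform back."""
    L = len(h) + len(x) - 1
    H = dft(list(h) + [0.0] * (L - len(h)))
    X = dft(list(x) + [0.0] * (L - len(x)))
    Y = [Hk * Xk for Hk, Xk in zip(H, X)]
    return [v.real for v in dft(Y, inverse=True)]

h = [0.25, 0.5, 0.25]
x = [1.0, 2.0, 0.0, -1.0, 3.0]
y_time = convolve(h, x)
y_freq = convolve_via_dft(h, x)
```

Both routes give the same output up to floating-point rounding; the speed advantage mentioned in the text only appears once the naive transform is replaced by an FFT.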

Many other methods exist for the implementation of digital filters, including multi-rate approaches and filter banks. Of course, the design of a filter such as LP, band pass (BP), band stop (BS), and so on, will be a function of the signal versus noise spectrum in the specific application.

Adaptive noise cancellation

Although standard filters can provide good solutions, there are cases where the noise spectrum is changing through time (non-stationary signals), or significantly overlaps the signal bandwidth. With some digital filtering tricks we can cancel noises or at least increase significantly the SNR. An alternative solution is to collect the background noise away from the signal source and assume that there is no correlation between signal and noise. Such collection can provide different noise than the one at the signal location but yet related by some unknown phase shift and amplitude relationship. With two microphones, or more, we can do this using an adaptive digital filter technique like that shown in Fig. 3.

Here n[k] is the noise added at the signal collection source, while $n'[k]$ is the noise collected elsewhere (away from the signal source) but assumed to be related to n[k] through an unknown yet linear transfer function $H_n(z)$. Finally $\hat{x}[k]$ and $\hat{n}[k]$ are the estimates of the actual signal and noise. An adaptive algorithm changes the filter coefficients in real time ("on the fly"), sample after sample, to minimize the error between x[k] and $\hat{x}[k]$. There are plenty of solutions proposed in the literature to achieve such a goal. The basic one below is described for the purpose of example and understanding. First we define a figure of merit as:

Since noise and signal are not correlated we can write:

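(10) $\sigma_{\hat{x}}^2 = \sigma_x^2 + \sigma_\varepsilon^2$

Where:

(11) $\sigma_\varepsilon^2 = E\left[(n[k] - \hat{n}[k])^2\right]$

Obviously when $H_n(z)$ is correctly identified, $\sigma_\varepsilon = 0$.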

The basic algorithm seeks to minimize the power (variance) of $\hat{x}[k]$ using some Least Mean Square (LMS) procedure. Of course $\sigma_{\hat{x}}^2$ is estimated empirically with some short-time method like sliding windows (15).

One very popular method uses the gradient $\nabla\sigma_{\hat{x}}^2$ (12) and updates the filter coefficients (a & b) accordingly (13.a), (13.b):

(13.a) $a_i[k] = a_i[k-1] - \mu \dfrac{\partial \sigma_{\hat{x}}^2}{\partial a_i}, \quad i \in \{1, \dots, p\}$

(13.b) $b_i[k] = b_i[k-1] - \mu \dfrac{\partial \sigma_{\hat{x}}^2}{\partial b_i}, \quad i \in \{1, \dots, q\}$

Where μ ≥ 0 is a step factor controlling the algorithm convergence speed. Of course if a constant amplitude periodic noise is synchronized with the signal-sampling rate it can be seen as a DC component at the time of sampling (aliased to 0 Hz). Such cases are easy to cancel, especially if we can trigger the signal acquisition with some external source synchronized with the sealing machine assumed to be responsible for the noise itself.
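A minimal Python sketch of a two-microphone noise canceller is given below. It is a simplification of the scheme in the text: the adaptive filter is FIR only (no recursive a coefficients), the instantaneous squared output replaces the sliding-window power estimate, and the noise path, signal and step size are hypothetical values chosen for illustration.

```python
import math
import random

def lms_cancel(primary, reference, taps=2, mu=0.005):
    """FIR LMS noise canceller (a simplified sketch, not the full a/b scheme).

    primary[k]   = signal + noise picked up near the signal source
    reference[k] = noise picked up away from the signal source
    Returns (signal_estimate, weights).
    """
    w = [0.0] * taps
    estimate = []
    for k in range(len(primary)):
        # Most recent `taps` reference samples (zeros before the start)
        r = [reference[k - i] if k - i >= 0 else 0.0 for i in range(taps)]
        noise_hat = sum(wi * ri for wi, ri in zip(w, r))
        e = primary[k] - noise_hat            # current estimate of the signal
        # Stochastic-gradient step that reduces the power of e
        w = [wi + 2.0 * mu * e * ri for wi, ri in zip(w, r)]
        estimate.append(e)
    return estimate, w

random.seed(0)
n_samples = 5000
signal = [0.5 * math.sin(2 * math.pi * 0.01 * k) for k in range(n_samples)]
ref = [random.uniform(-1.0, 1.0) for _ in range(n_samples)]
# Hypothetical noise path H_n(z) = 0.8 - 0.3 z^-1
noise = [0.8 * ref[k] - 0.3 * (ref[k - 1] if k > 0 else 0.0) for k in range(n_samples)]
primary = [s + n for s, n in zip(signal, noise)]

estimate, weights = lms_cancel(primary, ref)
```

After convergence the weights approximate the assumed noise path, and the residual noise in `estimate` is a small fraction of the noise in `primary`.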

Interferometry

Sometimes it is not possible to locate the microphone near the source of signal or noise. To focus on it anyway we would need a very directive microphone, which could be expensive or difficult to build. Yet another way to be direction sensitive is to use an array of microphones and apply basic interferometry techniques. In these embodiments, the signals from the microphones, optionally filtered first, are summed with variable delays and gains in order to define a specific directional pattern, like a narrow angle of sensitivity. Equation (14) describes the basic concept, where $d_k$ are delays (integers), $G_k$ gains, $x_k[n]$ the k-th microphone's signal and y[n] the composite signal (a virtual directive microphone) resulting from processing an array of N microphones:

(14) $y[n] = \sum_{k=1}^{N} G_k \cdot x_k[n] \cdot z^{-d_k}$
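The delay-and-sum operation of (14) can be sketched in Python as follows, reading $z^{-d_k}$ as a delay of $d_k$ samples applied to channel k. The function name `delay_and_sum`, the two-microphone geometry and the gains are hypothetical.

```python
def delay_and_sum(channels, delays, gains):
    """y[n] = sum_k G_k * x_k[n - d_k], with samples before the start taken as zero."""
    length = len(channels[0])
    y = [0.0] * length
    for x, d, g in zip(channels, delays, gains):
        for n in range(d, length):
            y[n] += g * x[n - d]
    return y

# A pulse that reaches microphone 0 two samples after microphone 1 (hypothetical geometry)
mic0 = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
mic1 = [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
steered = delay_and_sum([mic0, mic1], delays=[0, 2], gains=[0.5, 0.5])
unsteered = delay_and_sum([mic0, mic1], delays=[0, 0], gains=[0.5, 0.5])
```

With the delays matched to the arrival times, the two pulses add coherently; without them, the same pulse appears twice at half amplitude.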

Gain control

Finally the pre-processing could be adapted to control the overall signal levels. This procedure is often needed to be sure that we have signals in range for the next step of the processing and to avoid any hardware clipping or poor SNR. There are numerous approaches to do so, using either some normalization scheme, meaning the normalized signal is not amplitude sensitive anymore, or automatic gain control (AGC).

Although such a task can be performed by the feature extraction step, we describe it at this stage for the sake of simplicity. The objective is to keep the signals always in the same range regardless of external fluctuations. A classical assumption is that those fluctuations are slower than the signal, so that a relatively slow tracking and adjustment will be able to cancel such drifts. The cancellation can be done either by software applying some correction gains, or by a direct feedback to the acquisition hardware like amplifiers. Yet some information has to be extracted in order to estimate the levels. A basic example is the signal power estimation using the empirical variance or RMS (Root Mean Square) value. Equation (15) provides an estimation using a sliding average of M samples. Digital filters can be used as well, like a first order IIR LP, for such estimation:

The AGC procedure will compute a corrective gain $G_{AGC}$ based on a reference signal power.
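A sliding power estimate and a corrective gain might be sketched as follows. This is an assumption about the general shape of such a scheme, not a reproduction of equations (15) and (16): the power is the mean square over the last M samples, and the gain is the square root of the ratio of a reference power to the measured power. The names `sliding_power`, `agc_gain` and `reference_power` are hypothetical.

```python
import math

def sliding_power(x, M):
    """Mean of the squares of the last M samples (fewer at the start of the record)."""
    out = []
    acc = 0.0
    for n, v in enumerate(x):
        acc += v * v
        if n >= M:
            acc -= x[n - M] ** 2      # drop the sample that left the window
        out.append(acc / min(n + 1, M))
    return out

def agc_gain(power, reference_power, floor=1e-12):
    """Gain that brings a signal of the measured power to the reference power."""
    return math.sqrt(reference_power / max(power, floor))

x = [2.0, -2.0] * 50                  # constant-level test signal, power 4
power = sliding_power(x, M=10)
gain = agc_gain(power[-1], reference_power=1.0)
```

The `floor` argument guards against division by zero during silence; in practice the gain would also be rate-limited so that it tracks only the slow drifts mentioned above.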

Feature extraction

In order to be able to recognize and classify the signal we need to extract some features from it. Those can be based on its intensities, its spectrum, its power structure, a model, a time-frequency analysis, or more complex features. At the end we build a decision vector $\mathbf{x}$ where each component is one feature, and the last step will be in charge of recognition and classification of such a vector.

Here we describe just 2 possible approaches, but it is understood that general signal processing methods suitable for such tasks include, but not exclusively: a bank of filters, sliding FFT and periodograms, wavelet transforms, model based methods, correlation, matched filters, and Kalman filters.

(a) Parametric approaches

There are two fundamental types of method, the parametric ones and the non-parametric ones. Let us start with the first type. In such a case we assume some underlying model for the signal production. This is a very popular technique often used in speech processing and time series analysis. There are many models like:

ARMA: Auto Regressive Moving Average

ARMAX: Auto Regressive Moving Average with eXogenous variable

AR: Auto Regressive

ARX: Auto Regressive with eXogenous variable

Box-Jenkins: General polynomial model

For the sake of simplicity we are going to use an AR example. Such model assumes that the signal is a filtered version of a broadband noise (named "white" noise), or sometimes a periodic impulse excitation. Therefore our signal x[n] is described by (17):

(17) $x[n] = \sum_{k=1}^{p} a[k]\, x[n-k] + e[n]$

Where e[n] is a white noise uncorrelated with the signal. Such a recursive scheme is also named LPC (Linear Predictive Coding). The AR algorithm estimates the a[k] filter coefficients. In its basic version it is just an LMS method based on the minimization of the residual noise power. Those coefficients can be seen as features describing the signal. Therefore the recognition step can use them to make a decision.
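One way to estimate the a[k] coefficients of (17) is sketched below. Rather than an iterative LMS update, this example solves the Yule-Walker equations with the Levinson-Durbin recursion, which also minimizes the residual power in the least-squares sense; that choice, the function names and the synthetic AR(2) test signal are assumptions made for illustration.

```python
import random

def autocorrelation(x, max_lag):
    """Biased autocorrelation estimates r[0..max_lag]."""
    N = len(x)
    return [sum(x[n] * x[n - lag] for n in range(lag, N)) / N for lag in range(max_lag + 1)]

def ar_coefficients(x, p):
    """Estimate a[1..p] in x[n] = sum_k a[k] x[n-k] + e[n] (Levinson-Durbin recursion).

    Returns (coefficients, residual_power).
    """
    r = autocorrelation(x, p)
    a = [0.0] * (p + 1)
    err = r[0]
    for m in range(1, p + 1):
        k_m = (r[m] - sum(a[k] * r[m - k] for k in range(1, m))) / err
        new_a = a[:]
        new_a[m] = k_m
        for k in range(1, m):
            new_a[k] = a[k] - k_m * a[m - k]
        a = new_a
        err *= 1.0 - k_m * k_m
    return a[1:], err

random.seed(1)
true_a = [1.2, -0.5]                      # hypothetical stable AR(2) model
x = [0.0, 0.0]
for _ in range(20000):
    x.append(true_a[0] * x[-1] + true_a[1] * x[-2] + random.gauss(0.0, 1.0))

features, residual_power = ar_coefficients(x, p=2)
```

The list `features` plays the role of the feature vector: for a long enough record it approaches the coefficients used to synthesize the signal.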

(b) Non-parametric approaches

Here we do not assume any model for the signal; we just extract some features like its spectrum or its power in some frequency bands.

Cross-correlation methods are also widely used. They compare the signal x[n] to some empirical references rτ[n] using a relation like (18):

Signals are assumed to be null outside the range {0, ..., N−1}.

Obviously a special simple case is to take the raw signal samples themselves as the features, with or without decimation and/or interpolation.

Now let us take an example using an FFT and Parseval's theorem to estimate the signal power in 3 bands, low (LF), medium (MF) and high frequency (HF). Of course we assume those are the right figures of merit here.

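(19) $X_f(k) = \sum_{n=0}^{N-1} x[n]\, e^{-j 2\pi k n / N}, \quad k \in \{0, \dots, N-1\}$

Equation (19) is the classical discrete Fourier transform, with the property $X_f(k) = X_f^*(-k)$ for real signals. It is expressed in terms of normalized frequencies (or bins), such that bin k corresponds to the frequency $k f_s / N$, where $f_s$ is the sampling frequency.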

Now we define the 3 bands with their normalized frequency limits (excluding in this case the DC bin):

LF $B_1: 0 < k \le k_1$

MF $B_2: k_1 < k \le k_2$

HF $B_3: k_2 < k \le k_3$

With $k_1 < k_2 < k_3 \le N-1$.

Power estimation for each band is computed as (20):

In that example $e_1$, $e_2$, and $e_3$ are the 3 features extracted from the signal. They allow us to build a 3D vector $\mathbf{x}$ (21) for the recognition task.
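The three band-power features can be sketched in Python as follows. The normalization by $N^2$ (so that, by Parseval's theorem, the sum over all bins equals the mean square of the signal) is an assumption of this sketch, as are the band limits and the test tone; a naive DFT stands in for the FFT.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform, as in the classical definition."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def band_powers(x, k1, k2, k3):
    """Power in the bands 0<k<=k1, k1<k<=k2, k2<k<=k3 (DC bin excluded)."""
    N = len(x)
    X = dft(x)

    def power(lo, hi):
        return sum(abs(X[k]) ** 2 for k in range(lo + 1, hi + 1)) / N ** 2

    return [power(0, k1), power(k1, k2), power(k2, k3)]

N = 64
x = [math.sin(2 * math.pi * 5 * n / N) for n in range(N)]   # tone exactly on bin 5
features = band_powers(x, k1=8, k2=16, k3=32)
```

For this tone all the power falls in the low band; the value is half the mean square of the signal because only the positive-frequency bins are summed.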

Yet another technique would be the Feature Vector Selection (FVS) which can be used to extract relevant samples in linear or non-linear cases.

Recognition and classification

The last task is to recognize the signal(s) using its (their) features and assign it (them) to some classes. As in the feature extraction step above, there are two general types of approach:

- Parametric, meaning we have a model of the data statistical distribution(s)

- Non-parametric, meaning no prior knowledge of the data statistical distribution(s).

The analysis further depends on whether we know, or at least have samples from, only the good reference class, named class 0 in the following discussion. That means the class where the sealing process was acceptable. Such a situation is known as a one class problem. In other situations we know the class(es), namely class 1, 2, ..., and so on, where the sealing process was bad. Such a situation is known as a multi-class problem. We even have situations where we know the class 0 and just some of the other classes but not all. Finally we can see the one class problem as a special two class case where the class 1 is totally unknown.

When we deal with more than one feature it is important to use multivariate statistics, since almost always there are significant correlations between features, the components of $\mathbf{x}$. Failure to recognize this could seriously impact the recognition performance. Marginal distributions are only one side of the reality; the joint distribution is the key.

Parametric approaches

A parametric approach assumes that we have knowledge of the underlying distribution of the data. The most common distribution is the normal (Gaussian) case. Yet there are many more distributions along with their combinations, so this analysis is problem specific and must be part of a data analysis up front of any design. For the sake of simplicity here we are going to assume that our data comes from a normal distribution, which does not limit the scope of the discussion whatsoever. Equation (22) introduces the p-dimensional multivariate normal distribution (density) in a matrix form:

The quadratic form $\Delta^2$ (23) is named the Mahalanobis distance (MD), where $C_x$ is the covariance matrix and $\bar{\mathbf{x}}$ the mean vector [5].

The MD is always positive or null, and Fig. 4 shows the locus of a constant MD for p = 2.

It can be shown that $\Delta^2$ follows a chi-square ($\chi^2$) law with p degrees of freedom. Therefore we can set up a threshold $MD_{th}$ in order to reject only a small fraction of the population when the MD is larger than it (see chi-square tables). This means we can accept and assume that any sample below this threshold belongs to the class defined by the statistics $C_x$ and $\bar{\mathbf{x}}$. In practice we deal with estimators of the actual covariance matrix and mean vector.

In the case of a one class problem the MD method can be easily applied if the data are normal. This means that every sample outside the decision region, i.e. with $MD > MD_{th}$, is rejected. As always, statistical decisions are defined as a function of risk assessment and management. If you would like to minimize the risk of accepting a defective seal, then $MD_{th}$ must be small, but on the other hand you take the risk of rejecting good samples with no seal problem. To minimize that risk $MD_{th}$ must be larger. This is a trade off that cannot be avoided, and it is up to the user to decide which rule and strategy to apply.
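A one class decision based on the Mahalanobis distance might be sketched as follows for p = 2. The sketch assumes the usual definition $\Delta^2 = (\mathbf{x} - \bar{\mathbf{x}})^T C_x^{-1} (\mathbf{x} - \bar{\mathbf{x}})$ and uses the fact that, for 2 degrees of freedom, the chi-square tail probability is $e^{-t/2}$, so the threshold has a closed form. The class statistics and the 1 % rejection rate are hypothetical.

```python
import math

def mahalanobis_sq(x, mean, cov):
    """Squared Mahalanobis distance for p = 2 using the explicit 2x2 inverse."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]
    dx = [x[0] - mean[0], x[1] - mean[1]]
    return sum(dx[i] * inv[i][j] * dx[j] for i in range(2) for j in range(2))

def chi2_threshold_2d(reject_fraction):
    """For 2 degrees of freedom, P(chi2 > t) = exp(-t/2), so t = -2 ln(reject_fraction)."""
    return -2.0 * math.log(reject_fraction)

mean = [0.0, 0.0]                   # hypothetical class-0 statistics
cov = [[4.0, 0.0], [0.0, 1.0]]
md_th = chi2_threshold_2d(0.01)     # reject about 1 % of good seals

sample_ok = mahalanobis_sq([2.0, 1.0], mean, cov)
sample_far = mahalanobis_sq([8.0, 0.0], mean, cov)
accepted = [d2 <= md_th for d2 in (sample_ok, sample_far)]
```

Lowering `reject_fraction` raises the threshold and therefore accepts more samples, which is the trade off between the two risks described above.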

With a multi-class case, among many techniques we can use a Bayesian approach. This assumes N classes $(c_1, \dots, c_N)$ having a priori probabilities $p(c_1), \dots, p(c_N)$, and the probability of the sample $\mathbf{x}$ is $p(\mathbf{x})$. The conditional probability is:

(24) $p(\mathbf{x} \mid c_j) = p_j(\mathbf{x})$

For a given sample the posterior probability of the class j, $p(c_j \mid \mathbf{x})$, is the probability that $\mathbf{x}$ belongs to the class j and is given by the Bayes formula:

(25) $p(c_j \mid \mathbf{x}) = \dfrac{p(c_j)\, p_j(\mathbf{x})}{p(\mathbf{x})}$

A Bayesian classifier of N classes compares the a posteriori probabilities $p(c_1 \mid \mathbf{x}), p(c_2 \mid \mathbf{x}), \dots, p(c_N \mid \mathbf{x})$ and assigns the sample $\mathbf{x}$ to the class with the largest a posteriori probability. It is also equivalent to finding the maximum of $p(c_j)\, p_j(\mathbf{x})$, since the denominator is the same for all the classes.

In the case of normal distributions, for any class j we get:

Maximizing $p(c_j)\, p_j(\mathbf{x})$ is equivalent to maximizing its logarithm:

In practice the covariance matrix and the mean are estimated from the samples. If the covariance matrices $C_{x_j}$ are different the assignment rule is quadratic (27).

In the case of equal covariance matrices $C_{x_1} = C_{x_2} = \dots = C_{x_N} = C_x$ it becomes linear:

(28) $\dots + \ln p(c_j)$
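The Bayesian assignment rule for normal classes can be sketched in Python as below. The score used is $\ln[p(c_j)\, p_j(\mathbf{x})]$ for a 2-D normal density with the constant term common to all classes dropped, which is an assumption about the form of the rule rather than a transcription of (26) to (28). The class statistics and priors are hypothetical.

```python
import math

def log_posterior_score(x, mean, cov, prior):
    """ln[p(c_j) p_j(x)] for a 2-D normal class, up to a constant shared by all classes."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx0, dx1 = x[0] - mean[0], x[1] - mean[1]
    # Squared Mahalanobis distance using the explicit 2x2 inverse
    md2 = (d * dx0 * dx0 - (b + c) * dx0 * dx1 + a * dx1 * dx1) / det
    return -0.5 * md2 - 0.5 * math.log(det) + math.log(prior)

def bayes_classify(x, classes):
    """Assign x to the class with the largest score (largest a posteriori probability)."""
    return max(classes, key=lambda name: log_posterior_score(x, *classes[name]))

# Hypothetical class statistics: (mean, covariance, prior)
classes = {
    "class0_good": ([1.0, 0.2], [[0.02, 0.0], [0.0, 0.02]], 0.95),
    "class1_bad": ([0.5, 0.6], [[0.05, 0.01], [0.01, 0.05]], 0.05),
}

label_a = bayes_classify([0.95, 0.25], classes)
label_b = bayes_classify([0.5, 0.6], classes)
```

Because the two covariance matrices differ here, the decision boundary is quadratic; with a shared covariance matrix the quadratic terms cancel between classes and the rule becomes linear in the sample.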

Non-parametric approaches

Non-parametric cases are more complex than parametric ones. Not only do we face the classical problem of statistical estimation, but we also need to discover and describe empirically either the distributions using only the data, or the decision rule for directly defining some boundary in the sample space.

Of course it is possible to reuse Bayes's method if we can express the distributions using density estimation techniques. But unfortunately the Bayes rule in non-parametric problems is often too complex, and exhibits bias and variance errors, not to mention that it could be very time-consuming as well. To overcome such issues other tools have been developed, such as Linear Discriminant Analysis (LDA), its non-linear version the Generalized Discriminant Analysis (GDA), neural networks, Support Vector Machines (SVM), genetic programming, and kernel methods. SVM are very interesting since they implement statistical learning theory, and especially the structural risk minimization scheme.

Although it is not our purpose to describe any method in detail, we will give here an example of machine learning using a type of neural network named "back-propagation" (BPNN) or multi-layer perceptron neural network. This type of neural network has a structure of neurons disposed in layers. A neuron as shown in Fig. 5 receives inputs and first calculates a weighted sum of them, then applies a non-linear activation (transfer) function to evaluate its output. The connection between two neurons i and j is denoted by $w_{ij}$ (weight). Each neuron k calculates the sum of its inputs:

Where d is the dimension of the input vector of the neuron k. The output of the neuron is obtained by applying the activation function to s.

An example of such a neural network is shown schematically in Fig. 6. Briefly, for each measured sample during the learning phase, each output is evaluated and compared to the desired output. The error for the output neuron i is defined by the difference between the output $o_i$ of the network and the desired output $t_i$. The global error for N output neurons is defined by:

(30) $E = \dfrac{1}{2} \sum_{i=1}^{N} (o_i - t_i)^2$

The factor ½ is introduced to simplify the calculation of the derivatives. The aim of the optimization algorithm named "back propagation" is to adapt the weights at each iteration in order to minimize the current error. At the iteration t+1, each weight $w_{ij}$ is adapted according to the following rule:

(31) $w_{ij}(t+1) = w_{ij}(t) - \varepsilon \dfrac{\partial E}{\partial w_{ij}}$

Where ε is the learning rate parameter.

The first step of the algorithm is the learning phase, which consists of initializing (generally at random) the weights and adapting them after each presentation of a sample. The samples are presented and the weights adjusted until they do not vary anymore. The choice of the parameter ε influences the convergence of the algorithm to a stable configuration. If ε is too small the convergence is slow, and if ε is too large the algorithm may oscillate and fail to converge. This parameter is in general adjusted empirically.

The second step is the test phase or validation, which allows evaluating the capacity of the network to generalize for samples never seen during learning. The output of the network is evaluated and compared to the desired output. The percentage of examples correctly recognized shows the performance of the model. The weights are not adjusted in this step. It has been shown that such a BPNN can converge towards a good empirical estimation of the a posteriori probabilities $p(c_j \mid \mathbf{x})$.
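The learning phase can be illustrated with a small Python sketch of a network with the layer sizes of Fig. 6 (four inputs, three hidden neurons, three outputs). Several choices here are assumptions: a sigmoid activation, no separate bias terms (the last input is held at 1 to act as one), on-line gradient-descent weight updates with a learning rate `eps`, and a toy training set of three labelled feature vectors.

```python
import math
import random

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

class BPNN:
    """Three-layer perceptron trained by on-line back-propagation (no bias terms)."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_out)]

    def forward(self, x):
        h = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in self.w1]
        o = [sigmoid(sum(w * hi for w, hi in zip(row, h))) for row in self.w2]
        return h, o

    def error(self, x, t):
        """Global error: half the sum of squared output errors."""
        _, o = self.forward(x)
        return 0.5 * sum((oi - ti) ** 2 for oi, ti in zip(o, t))

    def train_step(self, x, t, eps):
        h, o = self.forward(x)
        # dE/ds for the output neurons, then propagated back to the hidden layer
        d_out = [(oi - ti) * oi * (1.0 - oi) for oi, ti in zip(o, t)]
        d_hid = [hj * (1.0 - hj) * sum(d_out[i] * self.w2[i][j] for i in range(len(o)))
                 for j, hj in enumerate(h)]
        # Gradient-descent updates: w <- w - eps * dE/dw
        for i, row in enumerate(self.w2):
            for j in range(len(row)):
                row[j] -= eps * d_out[i] * h[j]
        for j, row in enumerate(self.w1):
            for k in range(len(row)):
                row[k] -= eps * d_hid[j] * x[k]

# Hypothetical training set: (feature vector, desired one-hot output)
samples = [
    ([1.0, 0.0, 0.0, 1.0], [1.0, 0.0, 0.0]),
    ([0.0, 1.0, 0.0, 1.0], [0.0, 1.0, 0.0]),
    ([0.0, 0.0, 1.0, 1.0], [0.0, 0.0, 1.0]),
]
net = BPNN(4, 3, 3)
error_before = sum(net.error(x, t) for x, t in samples)
for _ in range(2000):
    for x, t in samples:
        net.train_step(x, t, eps=0.5)
error_after = sum(net.error(x, t) for x, t in samples)
```

A separate set of samples not used in the loop above would be needed for the test phase; this sketch only shows that the training error falls as the weights are adapted.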

Any one or more of the results of the pre-processing, feature extraction, and/or recognition can be used to adjust automatically the packaging machine settings using a feedback loop. This is especially useful when dealing with effects of ageing and dust. At the very least the method and apparatus of the invention can raise an alarm asking the operator to do some maintenance. For instance Kohonen's self-organizing map (SOM) is one way to monitor the overall system status and performance. Finally any statistics, parameters and models can be adapted using the samples in real time to follow population and system drifts.

Many other embodiments falling within the scope of the accompanying claims will be apparent to the skilled reader.