


Title:
APPARATUS AND METHOD FOR GENERATING CAPTION DATA
Document Type and Number:
WIPO Patent Application WO/2000/062297
Kind Code:
A1
Abstract:
An apparatus for generating caption data corresponding to digital audio data having a plurality of frames, includes: a caption unit generation block (106) responsive to a user instruction for generating the caption data corresponding to the digital audio data and dividing the caption data into a plurality of caption display units; and a caption data synchronization block (110) for assigning synchronization frame information to each of the caption display units, thereby generating a caption data file having the synchronization frame information and the caption display units. The apparatus includes a user interface screen, which displays the caption data exactly as it is displayed on a liquid crystal display (LCD). Therefore, the apparatus can implement a WHAT-YOU-SEE-IS-WHAT-YOU-GET (WYSIWYG) function.

Inventors:
HAN SANG-WOO (KR)
Application Number:
PCT/KR2000/000335
Publication Date:
October 19, 2000
Filing Date:
April 12, 2000
Assignee:
KOREA MEDIA CO LTD (KR)
HAN SANG WOO (KR)
International Classes:
G11B27/031; G11B27/034; G11B27/10; G11B27/34; (IPC1-7): G11B27/031
Foreign References:
EP0902431A21999-03-17
JPS6354662A1988-03-09
JPH08212231A1996-08-20
JPH07222055A1995-08-18
Attorney, Agent or Firm:
Choi, Jong-sik (741-40, Yeoksam 1-dong Kangnam-ku Seoul 135-081, KR)
Claims:
Claims
1. An apparatus for generating caption data corresponding to digital audio data having a plurality of frames, comprising: a caption data generation means for generating the caption data in response to a user instruction; and a sort means for sorting the caption data, generated by said caption data generation means, corresponding to the digital audio data by caption display units.
2. The apparatus as recited in claim 1, further comprising: a synchronization means for generating a caption data file having synchronous frame information by designating the synchronous frame information to the caption data sorted by the caption display units.
3. The apparatus as recited in claim 2, further comprising: a storage means for storing the digital audio data in the form of a digital audio data file.
4. The apparatus as recited in claim 3, wherein said synchronization means further includes: a combination means for combining the caption data file, having the synchronous frame information, outputted from said synchronization means with the digital audio data file outputted from said storage means.
5. The apparatus as recited in claim 1, wherein said sort means further includes: an edition means for editing the caption data by the caption display units in response to the user instruction by checking whether any caption data having a broken word exists among the caption data sorted by the caption display units.
6. The apparatus as recited in claim 5, wherein said sort means includes: a user interface screen having a size change icon for changing a size of each of the caption display units in response to the user instruction.
7. The apparatus as recited in claim 6, wherein said user interface screen includes: a display space for implementing a function of WHAT-YOU-SEE-IS-WHAT-YOU-GET (WYSIWYG) by displaying the caption data which is the same as that displayed on a liquid crystal display (LCD) contained in a digital audio and caption player.
8. The apparatus as recited in claim 4, wherein said synchronization means includes: a user interface screen for displaying the caption data sorted by the caption display units and the synchronous frame information.
9. The apparatus as recited in claim 8, wherein the synchronous frame information includes a synchronous frame number representing a start point of a frame contained in the digital audio data.
10. The apparatus as recited in claim 8, wherein said user interface screen includes: a playback icon for playing back the digital audio data stored in said storage means in response to the user instruction; a first display space for visually displaying a playback output waveform corresponding to the digital audio data being played back at present; and a second display space for displaying the synchronous frame information corresponding to the digital audio data being played back at present.
11. The apparatus as recited in claim 8, wherein said user interface screen includes: a start point selection icon for selecting a playback start point in response to the user instruction; an end point selection icon for selecting a playback end point in response to the user instruction; and a repeat icon for repeatedly playing back a predetermined section containing a predetermined number of caption display units in response to the user instruction.
12. The apparatus as recited in claim 8, wherein said user interface screen includes: a speed control bar for controlling a playback speed corresponding to the digital audio data in response to the user instruction; and a progressive bar for representing a playback point of the digital audio data.
13. The apparatus as recited in claim 8, wherein said synchronization means corrects the synchronous frame information, designated by the caption display units, by employing a synchronization icon in response to the user instruction.
14. The apparatus as recited in claim 1, wherein the digital audio data includes MP3 data, Windows Media Audio (WMA) data, and advanced audio coding (AAC) data.
15. The apparatus as recited in claim 8, wherein said user interface screen further includes a window for inputting the caption data.
16. The apparatus as recited in claim 15, wherein said window for inputting the caption data includes: a first display space for displaying information representing whether the caption data is displayed in a single language or a bilingual; a second display space for displaying a name of the single language or names of the bilingual; and a third display space for displaying the caption data inputted from a user interface device by the caption display unit.
17. The apparatus as recited in claim 16, wherein said window for inputting the caption data further includes: a file open icon for opening a list of caption data files in response to the user instruction; and a storage icon for storing the caption data inputted into the third display space.
18. The apparatus as recited in claim 16, wherein the single language includes English, Korean, Japanese, Arabic, Sanskrit and Latin.
19. The apparatus as recited in claim 16, wherein the bilingual includes English-Japanese, Korean-Japanese, English-another country language, Latin-another language, Sanskrit-another language and Arabic-another language.
20. The apparatus as recited in claim 6, wherein a size of each of the caption display units is made up of N numbers of characters per line and M numbers of lines, where the N and the M are positive integers, respectively.
21. The apparatus as recited in claim 20, wherein the M is 3.
22. The apparatus as recited in claim 20, wherein the N and the M are 16 and 4, respectively.
23. The apparatus as recited in claim 2, wherein the caption data file includes: a header; the caption data; and the synchronous frame information.
24. The apparatus as recited in claim 23, wherein the header includes mode information, identification information relating to a digital audio and caption player, version information relating to the digital audio and caption player, the number of characters per line displayed on a liquid crystal display (LCD), the number of lines displayed on the LCD, offset information, frame information of the digital audio data, the number of caption display units, language information, version information relating to the digital audio data, layer information of the digital audio data, index information relating to bit rate of the digital audio data, advertisement start point information, caption start point information, and size information of a caption data file.
25. A method for generating caption data corresponding to digital audio data having a plurality of frames, comprising the steps of: a) generating the caption data in response to a user instruction; and b) sorting the caption data corresponding to the digital audio data by caption display units.
26. The method as recited in claim 25, further comprising the step of: c) storing the digital audio data in the form of a digital audio data file.
27. The method as recited in claim 26, further comprising the steps of: d) generating a caption data file having synchronous frame information by designating the synchronous frame information to the caption data sorted by the caption display units; and e) checking whether the synchronous frame information should be corrected in response to the user instruction; and correcting the synchronous frame information designated by the caption display units.
28. The method as recited in claim 25, wherein said step a) further includes the step of: setting a size of each of the caption display units on the user interface screen in response to the user instruction.
29. The method as recited in claim 27, wherein said step e) further includes: combining the caption data file having the synchronous frame information with the digital audio data file.
30. The method as recited in claim 25, wherein said step b) further includes the steps of: checking whether any caption data, which has a broken word, exists among the caption data sorted by the caption display unit to edit the caption data by the caption display units; and displaying the edited caption data on a user interface screen.
31. The method as recited in claim 27, wherein said step d) includes: d1) sequentially selecting all the caption data by the caption display units; d2) playing back the stored digital audio data in response to the user instruction; d3) visually displaying a playback output waveform, corresponding to the stored digital audio data being played back, on the user interface screen; d4) displaying the synchronous frame information, corresponding to the stored digital audio data being played back, on the user interface screen; and d5) displaying the caption data having the synchronous information on the user interface screen by designating the synchronous frame information to all the caption data by the caption display units.
32. The method as recited in claim 31, wherein said step d3) includes the step of: implementing a function of WHAT-YOU-SEE-IS-WHAT-YOU-GET (WYSIWYG) by displaying the caption data which is the same as that displayed on a liquid crystal display (LCD) contained in a digital audio and caption player.
33. The method as recited in claim 32, wherein said step e) includes the steps of: e1) checking whether the designated synchronous frame information should be corrected in response to the user instruction; e2) if the designated synchronous frame information should be corrected, selecting a corresponding caption display unit; e3) moving a playback point to the caption data of the corresponding caption display unit selected; e4) playing back the digital audio data in synchronization with the caption data of the corresponding caption display unit; and e5) correcting the designated synchronous frame information according to the digital audio data being played back.
34. The method as recited in claim 27, wherein the synchronous frame information includes a synchronous frame number representing a start point of a frame contained in the digital audio data.
35. The method as recited in claim 32, wherein the user interface screen includes: a playback icon for playing back the digital audio data stored in response to the user instruction; a first display space for visually displaying a playback output waveform corresponding to the digital audio data being played back at present; and a second display space for displaying the synchronous frame information corresponding to the digital audio data being played back at present.
36. The method as recited in claim 32, wherein the user interface screen includes: a synchronous icon for designating the synchronous frame information in response to the user instruction; a start point selection icon for selecting a playback start point in response to the user instruction; an end point selection icon for selecting a playback end point in response to the user instruction; and a repeat icon for repeatedly playing back a predetermined section containing a predetermined number of caption display units in response to the user instruction.
37. The method as recited in claim 32, wherein the user interface screen includes: a speed control bar for controlling a playback speed corresponding to the digital audio data in response to the user instruction; and a progressive bar for representing a playback point of the digital audio data.
38. The method as recited in claim 32, wherein the user interface screen further includes a window for inputting the caption data.
39. The method as recited in claim 38, wherein the window for inputting the caption data includes: a first display space for displaying information representing whether the caption data is displayed in a single language or a bilingual; a second display space for displaying a name of the single language or names of the bilingual; and a third display space for displaying the caption data inputted from a user interface device by the caption display unit.
40. The method as recited in claim 39, wherein the window for inputting the caption data further includes: a file open icon for opening a list of caption data files in response to the user instruction; and a storage icon for storing the caption data inputted into the third display space.
41. The method as recited in claim 39, wherein the single language includes English, Korean, Japanese, Arabic, Sanskrit and Latin.
42. The method as recited in claim 39, wherein the bilingual includes English-Japanese, Korean-Japanese, English-another language, Latin-another language, Sanskrit-another language and Arabic-another language.
43. The method as recited in claim 25, wherein the digital audio data includes MP3 data, Windows Media Audio (WMA) data, and advanced audio coding (AAC) data.
44. The method as recited in claim 28, wherein the size of each of the caption display units is made up of N numbers of characters per line and M numbers of lines where the N and the M are positive integers, respectively.
45. The method as recited in claim 44, wherein the M is 3.
46. The method as recited in claim 44, wherein the N and the M are 16 and 4, respectively.
47. The method as recited in claim 27, wherein the caption data file includes: a header; the caption data; and the synchronous frame information.
48. The method as recited in claim 47, wherein the header includes mode information, identification information relating to a digital audio and caption player, version information relating to the digital audio and caption player, the number of characters per line displayed on a liquid crystal display (LCD), the number of lines displayed on the LCD, offset information, frame information of the digital audio data, the number of caption display units, language information, version information relating to the digital audio data, layer information of the digital audio data, index information relating to bit rate of the digital audio data, advertisement start point information, caption start point information, and size information of a caption data file.
50. A computer-readable medium storing program instructions, the program instructions disposed on a computer to perform a method for generating caption data corresponding to digital audio data having a plurality of frames, comprising the steps of: a) generating the caption data in response to a user instruction; and b) sorting the caption data corresponding to the digital audio data by caption display units.
51. The computer-readable medium as recited in claim 49, further comprising the step of: c) storing the digital audio data in the form of a digital audio data file.
52. The computer-readable medium as recited in claim 50, further comprising the steps of: d) generating a caption data file having synchronous frame information by designating the synchronous frame information to the caption data sorted by the caption display units; and e) checking whether the synchronous frame information should be corrected in response to the user instruction; and correcting the synchronous frame information designated by the caption display units.
53. The computer-readable medium as recited in claim 49, wherein said step a) further includes the step of: setting a size of each of the caption display units on the user interface screen in response to the user instruction.
54. The computer-readable medium as recited in claim 51, wherein said step e) further includes the step of: combining the caption data file having the synchronous frame information with the digital audio data file.
55. The computer-readable medium as recited in claim 49, wherein said step b) further includes the steps of: checking whether any caption data, which has a broken word, exists among the caption data sorted by the caption display unit to edit the caption data by the caption display units; and displaying the edited caption data on a user interface screen.
56. The computer-readable medium as recited in claim 51, wherein said step d) includes the steps of: d1) sequentially selecting all the caption data by the caption display units; d2) playing back the stored digital audio data in response to the user instruction; d3) visually displaying a playback output waveform, corresponding to the stored digital audio data being played back, on the user interface screen; d4) displaying the synchronous frame information, corresponding to the stored digital audio data being played back, on the user interface screen; and d5) displaying the caption data having the synchronous information on the user interface screen by designating the synchronous frame information to all the caption data by the caption display units.
57. The computer-readable medium as recited in claim 55, wherein said step d3) includes the step of: implementing a function of WHAT-YOU-SEE-IS-WHAT-YOU-GET (WYSIWYG) by displaying the caption data which is the same as that displayed on a liquid crystal display (LCD) contained in a digital audio and caption player.
58. The computer-readable medium as recited in claim 51, wherein said step e) includes the steps of: e1) checking whether the designated synchronous frame information should be corrected in response to the user instruction; e2) if the designated synchronous frame information should be corrected, selecting a corresponding caption display unit; e3) moving a playback point to the caption data of the corresponding caption display unit selected; e4) playing back the digital audio data in synchronization with the caption data of the corresponding caption display unit; and e5) correcting the designated synchronous frame information according to the digital audio data being played back.
59. A computer-readable medium storing program instructions, the program instructions disposed on a computer to perform a method for generating caption data corresponding to digital audio data having a plurality of frames, comprising the steps of: a) sending the digital audio data to a user interface device; and b) sending the caption data in synchronization with the digital audio data to the user interface device.
60. The computer-readable medium as recited in claim 58, wherein said step a) includes the step of: sending the digital audio data to a speaker contained in a digital audio and caption player.
61. The computer-readable medium as recited in claim 58, wherein said step b) includes the step of: sending the caption data in synchronization with the digital audio data to a liquid crystal display (LCD) contained in a digital audio and caption player.
Description:
APPARATUS AND METHOD FOR GENERATING CAPTION DATA

Description

Technical Field

The present invention relates to an apparatus and method for generating caption data; and, more particularly, to an apparatus and method for generating caption data in synchronization with digital audio data, and a computer-readable medium storing program instructions, the program instructions disposed on a computer to perform a method for generating the caption data in synchronization with the digital audio data.

Background Art

Generally, a digital audio data file contains only sound and music information as digital audio data. However, by using software of a digital audio player based on a personal computer, a user may also store other information, as well as the sound and music information, in the digital audio data file, which can be defined by the user. An MP3 data file, as the digital audio data file, includes a header portion, a main data portion and an auxiliary data portion. Caption data corresponding to the digital audio data is stored in the auxiliary data portion contained in the MP3 data file. The digital audio data configured in the MP3 data format is made up of audio access units (AAUs). An AAU is a minimum frame unit for the sound and music information of the MP3 data. Besides the MP3 data format, the digital audio data having a plurality of frames includes the audio part of the digital versatile disk (DVD) standard and the Moving Picture Experts Group (MPEG) standards.

The DVD includes a DVD-video, a DVD-ROM, a DVD-R, a DVD-RW, a DVD-RAM, a DVD-audio, a DVDX, etc. All the digital audio data stored in the DVD should be based on the MPEG-2 audio standard or the Dolby AC-3 audio standard.

Referring to Fig. 1, there is shown an exemplary view illustrating caption data having synchronous frame numbers. As shown, where a conventional digital audio player based on a personal computer records the following digital audio data frames of "an-nyong-ha-shim-mi-ca! ju-shik-kwae-sa Korea Media im-mi-da" (Korean) or "Hello! This is Korea Media", about 1000 digital audio frames may be needed. Further, where the conventional digital audio player based on the personal computer plays back the digital audio data frames of "an-nyong-ha-shim-mi-ca! ju-shik-kwae-sa Korea Media im-mi-da" (Korean) or "Hello! This is Korea Media", it may take about 4 seconds.

Furthermore, the conventional digital audio player based on the personal computer designates a synchronous frame number of the digital audio data to every word of the caption data to display the caption data in synchronization with the digital audio data on the monitor of the personal computer. In other words, where the conventional digital audio player plays back the digital audio data which starts at frame "0", the caption data of "an-nyong-ha-shim-mi-ca!" (Korean) or "Hello!" is displayed on the monitor of the personal computer. Thereafter, where the conventional digital audio player plays back the digital audio data which starts at frame "400", the caption data of "ju-shik-kwae-sa" (Korean) or "This" is displayed on the monitor of the personal computer.
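The frame-number lookup described above can be sketched as follows. This is an illustrative sketch, not code from the patent: the function name and the intermediate frame numbers (600 and 800) are assumptions made for illustration; only frames "0" and "400" come from the description.

```python
# Caption display units keyed by the synchronous frame number at which
# each unit should start being displayed. Frames 0 and 400 follow the
# example in the description; 600 and 800 are invented for illustration.
caption_units = [
    (0, "Hello!"),
    (400, "This"),
    (600, "is"),
    (800, "Korea Media"),
]

def caption_for_frame(units, frame):
    """Return the caption unit whose start frame is the latest one
    not exceeding the current playback frame."""
    current = None
    for start_frame, text in units:  # units are sorted by start frame
        if start_frame <= frame:
            current = text
        else:
            break
    return current
```

During playback, the player would call `caption_for_frame` with the frame number currently being decoded and refresh the display whenever the returned unit changes.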

However, the conventional digital audio player based on the personal computer could not easily and exactly designate the frame numbers of the digital audio data by caption display units. As a result, there was a problem that it could not exactly display the caption data in synchronization with the digital audio data on the monitor of the personal computer. Further, the conventional digital audio player based on the personal computer could not edit the caption data in synchronization with the digital audio data by the caption display units.

Disclosure of Invention

It is, therefore, an object of the present invention to provide an apparatus and method for generating caption data which can sort and edit the caption data in synchronization with digital audio data by caption display units, and a computer-readable medium storing program instructions, the program instructions disposed on a computer to perform the method.

It is, therefore, another object of the present invention to provide an apparatus and method for generating caption data which can designate exact synchronous frame numbers to the caption data in synchronization with digital audio data and a computer-readable medium storing program instructions, the program instructions disposed on a computer to perform the method.

It is further another object of the present invention to provide an apparatus and method for generating caption data which can implement a function of WHAT-YOU-SEE-IS-WHAT-YOU-GET (WYSIWYG) by making the caption data displayed on a liquid crystal display (LCD) the same as that displayed on a user interface screen based on a personal computer, and a computer-readable medium storing program instructions, the program instructions disposed on a computer to perform the method.

It is still further another object of the present invention to provide an apparatus and method for generating caption data which can be employed in music listening, language learning and religious education and a computer-readable medium storing program instructions, the program instructions disposed on a computer to perform the method.

In accordance with an aspect of the present invention, there is provided an apparatus for generating caption data corresponding to digital audio data having a plurality of frames, comprising: a caption data generation means for generating the caption data in response to a user instruction; and a sort means for sorting the caption data, generated by said caption data generation means, corresponding to the digital audio data by caption display units.

In accordance with another aspect of the present invention, there is provided a method for generating caption data corresponding to digital audio data having a plurality of frames, comprising the steps of: a) generating the caption data in response to a user instruction; and b) sorting the caption data corresponding to the digital audio data by caption display units.

In accordance with further another aspect of the present invention, there is provided a computer-readable medium storing program instructions, the program instructions disposed on a computer to perform a method for generating caption data corresponding to digital audio data having a plurality of frames, comprising the steps of: a) generating the caption data in response to a user instruction; and b) sorting the caption data corresponding to the digital audio data by caption display units.

In accordance with still further another aspect of the present invention, there is provided a computer-readable medium storing program instructions, the program instructions disposed on a computer to perform a method for generating caption data corresponding to digital audio data having a plurality of frames, comprising the steps of: a) sending the digital audio data to a user interface device; and b) sending the caption data in synchronization with the digital audio data to the user interface device.

Brief Description of the Drawings

The above and other objects and features of the instant invention will become apparent from the following description of preferred embodiments taken in conjunction with the accompanying drawings, in which:

Fig. 1 is an exemplary diagram depicting caption data having synchronous frame numbers;
Fig. 2 is a block diagram illustrating an apparatus for generating a digital audio data file combined with a caption data file in accordance with the present invention;
Figs. 3A and 3B are flowcharts illustrating a method for generating a digital audio data file combined with a caption data file in accordance with the present invention;
Fig. 4 is an exemplary view depicting a user interface screen in accordance with the present invention;
Fig. 5 is an exemplary view illustrating an input window of caption data generated by selecting a caption input icon located on the user interface screen shown in Fig. 4;
Fig. 6 is an exemplary view describing a list of caption data files generated by selecting a file open icon located on the input window of caption data shown in Fig. 5;
Fig. 7 is an exemplary view illustrating an input window of caption data with the caption data generated by selecting a caption data file from the list of caption data files shown in Fig. 6;
Fig. 8 is an exemplary view illustrating a user interface screen generated by selecting an OK icon located on the input window of caption data shown in Fig. 7;
Fig. 9 is an exemplary view describing a list of digital audio data files generated by selecting a file open icon located on the user interface screen shown in Fig. 8;
Fig. 10 is an exemplary view showing a user interface screen generated by selecting a digital audio data file from the list of digital audio data files shown in Fig. 9; and
Figs. 11A and 11B are exemplary views depicting synchronous frame numbers generated by selecting a synchronization icon located on the user interface screen shown in Fig. 10.

Best Mode for Carrying out the Invention

Referring to Fig. 2, there is shown a block diagram illustrating an apparatus for generating a digital audio data file combined with a caption data file in accordance with the present invention. As shown, the apparatus includes a user interface block 102, a caption storage block 104, a display unit generation block 106, a caption edition block 108, a caption synchronization block 110, an audio storage block 112, an audio playback block 114, and a combination block 116.

The user interface block 102 receives a user instruction outputted from a user 100. Herein, the user interface block 102 includes a monitor of a personal computer (PC) or a liquid crystal display (LCD) having a preset size, a keyboard, a mouse, a speaker, etc.

The caption storage block 104 stores caption data in response to the user instruction inputted from the user interface block 102. In this case, the caption storage block 104 may be implemented as a random access memory (RAM) or a hard disc.

The display unit generation block 106 sorts the caption data by caption display units on the user interface screen. Further, the display unit generation block 106 checks whether the caption data of any caption display unit requires a correction of its synchronous frame number, and selects the caption data of the corresponding caption display unit.

The display unit generation block 106 implements a function of WHAT-YOU-SEE-IS-WHAT-YOU-GET (WYSIWYG) by making the caption data displayed on the LCD of a digital audio and caption player (not shown) the same as that displayed on the user interface screen based on the personal computer.

The caption edition block 108 checks whether any caption data having a word broken in the middle exists, and edits the caption data accordingly.

The audio storage block 112 stores digital audio data having a plurality of frames in the form of the digital audio data file. Herein, the digital audio data includes MP3 data, Windows Media Audio (WMA) data, or advanced audio coding (AAC) data.

The audio playback block 114 plays back the digital audio data in synchronization with the caption data of the caption display unit selected on the display unit generation block 106.

The caption synchronization block 110 sequentially designates the synchronous frame numbers to the caption data of all the caption display units, where the caption data of the caption display units is sequentially selected. Thereafter, the caption synchronization block 110 generates the caption data file having the synchronous frame numbers.

Table 1

    Header portion | Caption data portion | Synchronous frame information portion

Table 2

    Field                                                      Length (bytes)
    Mode information (text mode or graphic mode)                    2
    Identification information relating to the digital audio        2
      and caption player
    Version information relating to the digital audio and           2
      caption player
    Information relating to the number of characters (the           2
      number of characters per line displayed on the LCD in
      text mode, or the number of dots per line displayed on
      the LCD in graphic mode)
    Information relating to the number of lines (the number         2
      of lines displayed on the LCD in text mode or graphic
      mode)
    Offset information                                              2
    Frame size information of MP1, MP2 or MP3                       2
    The number of caption display units                             2
    Language information                                            8
    Version information of MP1, MP2 or MP3                          2
    Layer information of MP1, MP2 or MP3                            2
    Index information relating to the bit rate of the digital       2
      audio data
    Reserved                                                       22
    Advertisement start location information                        4
    Caption start location information                              4
    Size information of the caption data file                       4

Table 1 shows the portions contained in the caption data file, and Table 2 shows the fields contained in the header portion of the caption data file. The header portion of the caption data file is made up of 64 bytes.
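The 64-byte header of Table 2 can be sketched as a fixed-size binary record. The byte order (little-endian), the unsigned integer encoding of each field, and the function and parameter names below are illustrative assumptions; the patent specifies only the field lengths.

```python
import struct

# Assumed little-endian layout: each 2-byte field as an unsigned short (H),
# each 4-byte field as an unsigned int (I), language as an 8-byte string,
# and the 22 reserved bytes zero-filled. Total size: 64 bytes, per Table 2.
HEADER_FORMAT = "<HHHHHHHH8sHHH22sIII"

def pack_header(mode, player_id, player_version, chars_per_line, lines,
                offset, frame_size, unit_count, language, mp_version,
                mp_layer, bitrate_index, ad_start, caption_start, file_size):
    """Pack the header fields of Table 2 into a 64-byte record."""
    return struct.pack(HEADER_FORMAT, mode, player_id, player_version,
                       chars_per_line, lines, offset, frame_size, unit_count,
                       language.ljust(8)[:8].encode("ascii"), mp_version,
                       mp_layer, bitrate_index, b"\x00" * 22,
                       ad_start, caption_start, file_size)
```

Packing any set of field values yields exactly 64 bytes, matching the stated header size.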

Further, the caption synchronization block 110 moves a playback point to the corresponding caption display unit in response to the user instruction inputted from the user interface block 102. Thereafter, the caption synchronization block 110 corrects a synchronous frame number designated to the caption data of the corresponding caption display unit in response to the user instruction inputted through a synchronization icon located on a user interface screen.

The combination block 116 combines the caption data file having the synchronous frame numbers outputted from the caption synchronization block 110 with the digital audio data file outputted from the audio storage block 112, to generate the digital audio data file combined with the caption data file having the synchronous frame numbers. The digital audio and caption player having the LCD may be coupled to the apparatus in accordance with the present invention. In this case, the digital audio and caption player can exactly display the caption data in synchronization with the digital audio data when it plays back the digital audio data file combined with the caption data file having the synchronous frame numbers generated by the apparatus in accordance with the present invention.
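The operation of the combination block 116 can be sketched as joining the two files into one. The layout below (caption data file placed before the audio data file) and the function name are assumptions for illustration; the patent does not fix the byte-level arrangement of the combined file.

```python
def combine_files(caption_path, audio_path, output_path):
    """Combine a caption data file and a digital audio data file into one
    output file. The caption-first layout is assumed for illustration."""
    with open(caption_path, "rb") as cap, open(audio_path, "rb") as aud:
        caption_bytes = cap.read()
        audio_bytes = aud.read()
    with open(output_path, "wb") as out:
        out.write(caption_bytes)
        out.write(audio_bytes)
```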

Referring to Figs. 3A and 3B, there are shown flowcharts illustrating a method for generating a digital audio data file combined with a caption data file in accordance with the present invention.

As shown, at step S302, a size of each of the caption display units may be set in response to a user instruction inputted from the user interface block 102 shown in Fig. 2. The size of a caption display unit is defined by N characters per line and M lines, where N and M are positive integers; for example, N is 16 and M is 4.

At step S304, a caption storage block 104 shown in Fig. 2 checks whether the caption data has been stored in the caption storage block.

At step S306, if the caption data has not been stored, the caption storage block 104 stores the caption data inputted on an input window of the caption data from the user interface block 102.

At step S308, a display unit generation block 106 shown in Fig. 2 sorts the caption data by the caption display units. The display unit generation block 106 implements the WYSIWYG function by making the caption data displayed on an LCD of a digital audio and caption player the same as that displayed on the user interface screen of the personal computer.
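The sorting at step S308 can be sketched as wrapping the caption text into units of at most N characters per line and M lines per unit. The word-preserving greedy wrap and the function name are illustrative assumptions; the patent only requires that the units match the N-by-M LCD layout.

```python
import textwrap

def sort_into_display_units(caption_text, n_chars=16, m_lines=4):
    """Split caption text into display units of at most m_lines lines,
    each line holding at most n_chars characters, without breaking words."""
    lines = textwrap.wrap(caption_text, width=n_chars)
    return [lines[i:i + m_lines] for i in range(0, len(lines), m_lines)]
```

With the example sizes N = 16 and M = 4, each resulting unit fills one screen of the player's LCD.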

At step S310, a caption edition block 108 shown in Fig. 2 checks whether any caption data having a broken word in the middle exists.

At step S312, if any caption data having the broken word in the middle exists, the caption edition block 108 edits the caption data in response to the user instruction.

At step S314, the display unit generation block 106 displays a list of the caption display units on the user interface screen.

At step S316, an audio storage block 112 shown in Fig. 2 checks whether the digital audio data file has been stored.

At step S318, if the digital audio data file has not been stored, the audio storage block 112 stores the digital audio data having a plurality of frames in the form of the digital audio data file.

At step S320, the display unit generation block 106 sequentially selects the caption data of all the caption display units.

At step S322, an audio playback block 114 shown in Fig. 2 plays back the digital audio data in synchronization with the caption data of the caption display unit selected.

At step S324, a caption synchronization block 110 shown in Fig. 2 sequentially designates synchronous frame numbers to the caption data of all the caption display units.
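The sequential designation at step S324 can be sketched as recording, for each caption display unit as it is selected, the audio frame number current at that moment. The data structures and the `frame_at_selection` callback are illustrative assumptions standing in for the playback state of the audio playback block.

```python
def designate_sync_frames(display_units, frame_at_selection):
    """Pair each caption display unit with the audio frame number that is
    current when the unit is selected, yielding (frame, unit) records.

    frame_at_selection: a callable returning the current playback frame
    number on each call (an assumed stand-in for the playback block).
    """
    records = []
    for unit in display_units:
        records.append((frame_at_selection(), unit))
    return records
```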

At step S326, the caption synchronization block 110 checks whether any caption data of a caption display unit exists which requires a correction of a synchronous frame number.

At step S328, if any caption data of the caption display unit exists which requires the correction of the synchronous frame number, the caption synchronization block selects the corresponding caption display unit in response to the user instruction.

At step S330, the caption synchronization block 110 moves a playback point to the caption data of the corresponding caption display unit selected and moves a progressive bar located on the user interface screen.

At step S332, an audio playback block 114 shown in Fig. 2 plays back the digital audio data in synchronization with the caption data of the corresponding caption display unit at the playback point moved by the caption synchronization block.

At step S334, the caption synchronization block 110 corrects the synchronous frame number designated to the caption data of the corresponding caption display unit according to the digital audio data in response to the user instruction inputted through a synchronous icon located on the user interface screen.

At step S335, the caption synchronization block 110 generates the caption data file having the synchronous frame numbers.

At step S336, a combination block 116 shown in Fig. 2 combines the caption data file having the synchronous frame numbers with the digital audio data file.

At step S338, the combination block generates the digital audio data file combined with the caption data file having the synchronous frame numbers.

The method for generating the digital audio data file combined with the caption data file in accordance with the present invention may be stored in a computer-readable medium such as an optical disk or a hard disk, etc.

Referring to Fig. 4, there is shown an exemplary view depicting a user interface screen, which is employed in the present invention. As shown, the user interface screen includes display spaces 400, 401, 402, 403, and 414, a progressive bar 404, a speed control bar 405, a synchronization icon 406, a playback icon 407, a size change icon 408, a caption input icon 409, a caption storage icon 410, a playback start-point selection icon 411, a playback end-point selection icon 412, and a repeat selection icon 413.

The display space 400 is a space for displaying a synchronous frame number of digital audio data being played back at present.

The display space 401 is a space for displaying the caption data of a selected caption display unit in response to a user instruction. The display space 402 is a space for visually displaying a playback waveform of the digital audio data. The display space 403 is a space for displaying the caption data by caption display units together with their synchronous frame numbers. Herein, the WYSIWYG function is implemented by making the caption data displayed on the display space 401 the same as that displayed on an LCD of a digital audio and caption player.

Information relating to a font contained in the caption data is displayed on the display space 414 in response to the user instruction.

The progressive bar 404 indicates a playback point. Where any caption display unit is selected, the progressive bar 404 moves the playback point to the caption display unit selected. The speed control bar 405 controls a playback speed of the digital audio data in response to the user instruction.

The synchronization icon 406 is selected to designate a synchronous frame number to the caption data of the caption display unit selected in response to the user instruction. The playback icon 407 is selected to play back the digital audio data in synchronization with the caption data of the caption display unit selected in response to the user instruction. The size change icon 408 is selected to change the size of the LCD for displaying caption data of the caption display unit in response to the user instruction. The caption input icon 409 is selected to open an input window for inputting the caption data in response to the user instruction. The caption storage icon 410 is selected to store the caption data in response to the user instruction.

The playback start-point selection icon 411 is selected to select a start point of playback in response to the user instruction. The playback end-point selection icon 412 is selected to select an end point of playback in response to the user instruction. The repeat selection icon 413 is selected to select a section containing a predetermined number of caption display units to be repeatedly played back in response to the user instruction.

Referring to Fig. 5, there is shown an exemplary view illustrating an input window of caption data generated by selecting a caption input icon on the user interface screen shown in Fig. 4. The input window of the caption data includes display spaces 500, 501 and 508, a file open icon 502 and a save icon 503.

The display space 508 is a space for displaying information representing whether the caption data is displayed in a single language or in two languages. The display space 500 is a space for displaying the name of the single language or the names of the two languages.

The single language includes English, Korean, Japanese, Latin, Arabic, Sanskrit, etc. The bilingual combinations include English/Japanese, Korean/Japanese, English/another language, etc.

The display space 501 is a space for displaying the caption data inputted from a user interface block 102 shown in Fig. 2.

The file open icon 502 is selected to open a list of caption data files in response to a user instruction. The save icon 503 is selected to save the caption data inputted on the display space 501 in response to the user instruction.

Referring to Fig. 6, there is shown an exemplary view showing a list of caption data files generated by selecting a file open icon located on the input window of caption data shown in Fig. 5. As shown, the list of caption data files includes a caption data file of "Caption 1. Txt".

Referring to Fig. 7, there is shown an exemplary view illustrating an input window of caption data with the caption data generated by selecting a caption data file from the list of caption data files shown in Fig. 6. As shown, the display space 501 displays English caption data of 16 characters per line by 4 lines per caption display unit. An OK icon 504 is selected to generate the user interface screen having the English caption data displayed on the display space 501 in response to a user instruction.

Referring to Fig. 8, there is shown an exemplary view depicting a user interface screen generated by selecting an OK icon on the input window of caption data shown in Fig. 7.

Referring to Fig. 9, there is shown an exemplary view illustrating a list of digital audio data files generated by selecting a file open icon on the user interface screen shown in Fig. 8. As shown, the list of digital audio data files includes a digital audio data file of "Caption.mp3" 901.

Referring to Fig. 10, there is shown an exemplary view illustrating a user interface screen generated by selecting a digital audio data file from the list of digital audio data files shown in Fig. 9. As shown, the user interface screen displays a name of the digital audio data file selected on the display space 401. Where the user interface screen selects the caption data 1000 of a caption display unit in response to a user instruction inputted from a user interface block 102 shown in Fig. 2, the display space 401 displays the selected caption data 1000. The display space 400 displays "point 126 frame" where a 126th frame of the digital audio data is being played back at present. The display space 402 displays a playback output waveform of the digital audio data being played back at present.

Referring to Figs. 11A and 11B, there are shown exemplary views showing a synchronous frame number generated by selecting a synchronization icon on the user interface screen shown in Fig. 10. As shown, the synchronization icon is selected to designate a synchronous frame number on the user interface screen in response to a user instruction. Herein, the synchronous frame number represents a start point of a frame of the digital audio data being played back.
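Since a synchronous frame number marks the start point of an audio frame, it maps to a playback time through the frame duration. For MPEG-1 Layer III (MP3), one frame carries 1152 PCM samples, so at a 44.1 kHz sampling rate each frame lasts roughly 26 ms. A sketch, with the function name and the choice of default sampling rate as illustrative assumptions:

```python
SAMPLES_PER_MP3_FRAME = 1152  # samples per MPEG-1 Layer III frame

def frame_to_seconds(frame_number, sample_rate=44100):
    """Convert a synchronous frame number to its playback start time (s)."""
    return frame_number * SAMPLES_PER_MP3_FRAME / sample_rate
```

For instance, the 126th frame shown in the display space 400 would begin at about 3.29 seconds into a 44.1 kHz MP3 stream.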

Reference numeral "1001" represents a designated synchronous frame number. The designated synchronous frame number is displayed on the left part of a display space 403. A speed control bar 405 controls a playback speed of the digital audio data in response to a user instruction. Where the designated synchronous frame number should be corrected, the user may correct the synchronous frame number on the user interface screen by using the speed control bar 405 or a synchronization icon 406.

Although the preferred embodiments of the invention have been disclosed for illustrative purposes, those skilled in the art will appreciate that various modifications, additions and substitutions are possible, without departing from the scope and spirit of the invention as disclosed in the accompanying claims.