Title:
CODING AND TRANSMISSION OF MULTIPLE WEB PAGES
Document Type and Number:
WIPO Patent Application WO/2001/050298
Kind Code:
A2
Abstract:
A method of providing information, comprising: providing a plurality of files including descriptions of a plurality of respective Web pages; selecting at least a sub-group of the Web pages; creating a combined file which includes descriptions of the Web pages in the sub-group; and transmitting the combined file to a client responsive to a request for one of the selected Web pages received from the client.

Inventors:
KESELMAN ALEX (IL)
NOV ISRAEL (IL)
Application Number:
PCT/IL2000/000721
Publication Date:
July 12, 2001
Filing Date:
November 05, 2000
Assignee:
WEBLINK LTD (IL)
KESELMAN ALEX (IL)
NOV ISRAEL (IL)
International Classes:
G06F17/30; H04L29/08; (IPC1-7): G06F17/00
Foreign References:
US5991713A1999-11-23
US5802520A1998-09-01
Other References:
CARD S K ET AL: "THE WEBBOOK AND THE WEB FORAGER: AN INFORMATION WORKSPACE FOR THE WORLD-WIDE WEB" COMMON GROUND. CHI '96 CONFERENCE PROCEEDINGS. CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS. VANCOUVER, APRIL 13 - 18, 1996, CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS, NEW YORK, ACM, US, 13 April 1996 (1996-04-13), pages 111-117, XP000657809 ISBN: 0-201-94687-4
PIROLLI P ET AL: "Silk from a Sow's Ear: Extracting Usable Structures from the Web" XEROX RESEARCH CENTER, 11 July 1996 (1996-07-11), XP002128179
Attorney, Agent or Firm:
Fenster, Paul (LTD. P. O. Box 49002 Petach Tikva, IL)
Claims:
CLAIMS
1. A method of providing information, comprising: providing a plurality of files including descriptions of a plurality of respective Web pages; selecting at least a sub-group of the Web pages; creating a combined file which includes descriptions of the Web pages in the sub-group; and transmitting the combined file to a client responsive to a request for one of the selected Web pages received from the client.
2. A method according to claim 1, wherein providing the plurality of files comprises providing files describing Web pages included in a single Web site.
3. A method according to claim 1 or claim 2, wherein providing the plurality of files comprises providing at least links to files that are the results of a search.
4. A method according to any of claims 1-3, wherein selecting the sub-group comprises selecting responsive to a map of the interconnections of the plurality of Web pages.
5. A method according to any of claims 1-4, wherein selecting the sub-group comprises selecting responsive to statistics of the usage of the plurality of Web pages.
6. A method according to any of claims 1-5, wherein selecting the sub-group comprises selecting responsive to a user profile.
7. A method according to any of claims 1-6, wherein selecting the sub-group comprises selecting responsive to a bandwidth of a link on which the file is transmitted.
8. A method according to any of claims 1-7, wherein creating the combined file comprises creating a combined file in which at least some of the descriptions of the Web pages are compressed.
9. A method according to any of claims 1-8, wherein creating the combined file comprises replacing at least one of the hypertext links in one or more of the selected pages with a script which actuates the display of the page referenced by the link.
10. A method according to claim 9, wherein replacing at least one of the hypertext links comprises replacing links which lead to one of the selected pages with a script which actuates the display of the page referenced by the link from within the combined file.
11. A method according to any of claims 1-10, wherein creating the combined file comprises replacing links to pages included in one or more other combined files with links which indicate the location of the referenced page within the other combined file.
12. A method according to any of claims 1-11, wherein creating the combined file is performed responsive to receiving the request from the client.
13. A method according to any of claims 1-11, wherein creating the combined file is performed independently of said request.
14. A method according to any of claims 1-13, wherein creating the combined file comprises detecting repeated embedded objects between the plurality of pages.
15. A method according to claim 14, wherein creating the combined file comprises detecting repeated embedded objects between the selected pages.
16. A method according to claim 15, comprising providing only one copy of said repeated object in said file.
17. A method according to any of claims 14-15, comprising providing only one copy of said repeated object as a separate file.
18. A method according to any of claims 1-17, wherein said request is an HTTP request.
19. A method according to any of claims 1-18, wherein transmission of the combined file, instead of a regular file, is transparent to a user that generates said request.
20. A method according to any of claims 1-19, wherein selecting at least a sub-group comprises selecting fewer than all the plurality of files.
21. A method according to any of claims 1-20, comprising maintaining a copy of said files on a file server associated with a storage of said combined file.
22. A method of providing information, comprising: receiving a request for a Web page, including one or more links to data elements, from a client; and transmitting to the client, in response to said request, a combined file including descriptions of the requested page and one or more of the data elements referenced by the one or more links of the Web page.
23. A method according to claim 22, comprising generating the combined file responsive to receiving the request from the client.
24. A method according to claim 22 or claim 23, wherein the combined file is generated before receiving the request from the client.
25. A method according to any of claims 22-24, wherein the one or more of the data elements referenced by the links of the Web page comprise at least one additional Web page.
26. A method according to any of claims 22-24, wherein the one or more of the data elements referenced by the links of the Web page comprise embedded objects.
27. Apparatus for web page serving, comprising: a compression unit that provides at least one combined file including the description of a plurality of WWW pages; and a web server that receives requests for WWW pages and responds with at least one of said combined files.
28. Apparatus according to claim 27, wherein said compression unit generates said file responsive to said request.
29. Apparatus according to claim 27 or claim 28, wherein said compression unit generates said file to be personalized for a particular user.
30. Apparatus according to claim 27, wherein said compression unit generates said file responsive to a request by a WWW site manager.
31. Apparatus according to claim 27 or claim 30, wherein said compression unit maintains a copy of uncompressed versions of said WWW pages.
32. Apparatus according to any of claims 27-31, wherein said compression unit comprises a grouper that selectively groups pages together based, at least, on their link structure.
33. Apparatus according to any of claims 27-32, wherein said compression unit comprises a redundancy detector that detects embedded elements repeated between said pages.
34. Apparatus according to any of claims 27-33, wherein said compression unit is integrated with a WWW site construction program.
35. Apparatus according to any of claims 27-33, wherein said compression unit is integrated with a WWW site maintaining program.
Description:
CODING AND TRANSMISSION OF MULTIPLE WEB PAGES

FIELD OF THE INVENTION

The present invention relates to communication networks and in particular to transmission of Web pages.

BACKGROUND OF THE INVENTION

Web servers are commonly used to provide users with information. Generally, the information is provided in the form of Web pages. Some Web pages are transmitted in the form of a single HTTP (Hypertext Transfer Protocol) file. Other Web pages, for example Web pages that include images or other embedded elements, are transmitted as an HTTP file which includes an HTML page and a plurality of additional files (referred to also as objects) referenced by the HTML page. When a client requests such Web pages from a server, in a first stage the HTTP file is transmitted to the client. When the client opens the HTML page it finds the references to the additional objects and sends additional HTTP requests to the server to receive these objects. Many Web sites include a plurality of Web pages that are interconnected using hypertext links. That is, Web pages usually include areas which when clicked upon initiate the retrieval and display of other Web pages, often from the same site.

The transmission of Web pages that include images typically requires large amounts of bandwidth. When clients are connected through low-bandwidth links, as most users are, such Web pages require relatively long transmission times, which annoy users. Fast response times are one of the features Web sites need in order to attract clients.

In an early version of the HTTP protocol, each HTTP request message is transmitted on a separate TCP connection to the server. The server sends the HTTP response message on the TCP connection on which the request message was received and then closes the TCP connection. A newer HTTP version (i.e., HTTP v1.1) optionally uses the same TCP connection for all the HTTP messages transmitted between the client and the server. Such connections are referred to as persistent HTTP connections. A single TCP connection may thus carry a stream of HTTP request messages from the client to the server. The time required for establishing the TCP connections is reduced using this scheme. Still, many clients and/or Web servers, for example due to load balancing limitations, do not support persistent HTTP connections.

The establishment of a TCP connection involves transmission of three packets in a handshake procedure. Establishing a separate connection for each retrieved Web page and each embedded element is therefore time-consuming.
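As a rough illustration of this cost, the sketch below counts the time spent only on connection setup when every page and embedded element opens its own TCP connection, compared with a single shared connection. It assumes that each handshake delays the first request by one round trip and that connections are opened one after another; parallel connections, slow start and server processing time are ignored. The function name and the numbers are hypothetical.

```python
def setup_overhead_ms(num_objects: int, rtt_ms: float, persistent: bool) -> float:
    """Time spent only on TCP handshakes, in milliseconds.

    Assumption: each handshake delays the first request by one round trip,
    and connections are opened sequentially (no parallelism).
    """
    connections = 1 if persistent else num_objects
    return connections * rtt_ms


# Hypothetical page: one HTML file plus 9 embedded images, 200 ms round trip.
separate = setup_overhead_ms(10, 200.0, persistent=False)
shared = setup_overhead_ms(10, 200.0, persistent=True)
print(separate, shared)  # 2000.0 200.0
```

Under these assumptions the per-object connections spend ten times as long on setup as a single connection, which is the overhead a combined file is meant to avoid.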

SUMMARY OF THE INVENTION

An aspect of some embodiments of the present invention relates to a method of storing and/or providing information, in which a plurality of Web pages and/or web page elements are transmitted as a single combined file, for example as an HTTP file. In some embodiments of the invention, the HTTP file comprises the HTML description of a master page, in a regular HTML format recognized by existing clients, and one or more additional pages in a compressed HTML format. Optionally, at least some of the hypertext links in the master HTML page, and optionally in the rest of the pages, are replaced, in some embodiments of the invention, by scripts, e.g., Java scripts, which when clicked upon display the respective page from the data stored in the HTTP file.

Alternatively or additionally, all the transmitted Web pages are stored in the combined file in a compressed format. A dummy master page is stored together with the compressed pages in the file. The dummy master page includes a Java script that initiates the display of one of the pages upon reception of the file by the client.

In an exemplary embodiment of the invention, files that are repeated between pages are provided only once in a compressed file that includes the pages.

In some embodiments of the invention, the Web pages to be included in a single combined file are selected responsive to an inter-link map of a compressed Web site (or any other group of Web pages being compressed). Alternatively or additionally, the Web pages included in a single combined file are determined responsive to statistics of the usage of the compressed pages and/or of the hypertext links connecting the pages. Alternatively or additionally, the Web pages included in a single combined file are determined responsive to a user profile of the user receiving the Web pages and/or of the bandwidth of the connection between the client and the server.

In some embodiments of the invention, the combined files of a Web site are prepared during and/or after the preparation of the Web site and the Web site is posted on a server as the combined file. Alternatively or additionally, the compression is performed on the fly responsive to the download requests from a client.

There is thus provided in accordance with an exemplary embodiment of the invention, a method of providing information, comprising: providing a plurality of files including descriptions of a plurality of respective Web pages; selecting at least a sub-group of the Web pages; creating a combined file which includes descriptions of the Web pages in the sub-group; and transmitting the combined file to a client responsive to a request for one of the selected Web pages received from the client. Optionally, providing the plurality of files comprises providing files describing Web pages included in a single Web site. Alternatively or additionally, providing the plurality of files comprises providing at least links to files that are the results of a search. Alternatively or additionally, selecting the sub-group comprises selecting responsive to a map of the interconnections of the plurality of Web pages.

Alternatively or additionally, selecting the sub-group comprises selecting responsive to statistics of the usage of the plurality of Web pages. Alternatively or additionally, selecting the sub-group comprises selecting responsive to a user profile. Alternatively or additionally, selecting the sub-group comprises selecting responsive to a bandwidth of a link on which the file is transmitted. Alternatively or additionally, creating the combined file comprises creating a combined file in which at least some of the descriptions of the Web pages are compressed.

Alternatively or additionally, creating the combined file comprises replacing at least one of the hypertext links in one or more of the selected pages with a script which actuates the display of the page referenced by the link. Optionally, replacing at least one of the hypertext links comprises replacing links that lead to one of the selected pages with a script which actuates the display of the page referenced by the link from within the combined file.

In an exemplary embodiment of the invention, creating the combined file comprises replacing links to pages included in one or more other combined files with links which indicate the location of the referenced page within the other combined file. Alternatively or additionally, creating the combined file is performed responsive to receiving the request from the client.

In an exemplary embodiment of the invention, creating the combined file is performed independently of said request. Alternatively or additionally, creating the combined file comprises detecting repeated embedded objects between the plurality of pages. Optionally, creating the combined file comprises detecting repeated embedded objects between the selected pages. Optionally, the method comprises providing only one copy of said repeated object in said file.

In an exemplary embodiment of the invention, the method comprises providing only one copy of said repeated object as a separate file.

In an exemplary embodiment of the invention, said request is an HTTP request.

Alternatively or additionally, transmission of the combined file, instead of a regular file, is transparent to a user that generates said request.

In an exemplary embodiment of the invention, selecting at least a sub-group comprises selecting fewer than all the plurality of files. Alternatively or additionally, the method comprises maintaining a copy of said files on a file server associated with a storage of said combined file.

There is also provided in accordance with an exemplary embodiment of the invention, a method of providing information, comprising: receiving a request for a Web page, including one or more links to data elements, from a client; and transmitting to the client, in response to said request, a combined file including descriptions of the requested page and one or more of the data elements referenced by the one or more links of the Web page. Optionally, the method comprises generating the combined file responsive to receiving the request from the client. Alternatively or additionally, the combined file is generated before receiving the request from the client. Alternatively or additionally, the one or more of the data elements referenced by the links of the Web page comprise at least one additional Web page. Alternatively, the one or more of the data elements referenced by the links of the Web page comprise embedded objects.

There is also provided in accordance with an exemplary embodiment of the invention, apparatus for web page serving, comprising: a compression unit that provides at least one combined file including the description of a plurality of WWW pages; and a web server that receives requests for WWW pages and responds with at least one of said combined files. Optionally, said compression unit generates said file responsive to said request. Alternatively or additionally, said compression unit generates said file to be personalized for a particular user.

In an exemplary embodiment of the invention, said compression unit generates said file responsive to a request by a WWW site manager.

In an exemplary embodiment of the invention, said compression unit maintains a copy of uncompressed versions of said WWW pages. Alternatively or additionally, said compression unit comprises a grouper that selectively groups pages together based, at least, on their link structure. Alternatively or additionally, said compression unit comprises a redundancy detector that detects embedded elements repeated between said pages.

Alternatively or additionally, said compression unit is integrated with a WWW site construction program. Alternatively or additionally, said compression unit is integrated with a WWW site maintaining program.

BRIEF DESCRIPTION OF FIGURES

Particular non-limiting embodiments of the invention will be described with reference to the following description of embodiments in conjunction with the figures. Identical structures, elements or parts which appear in more than one figure are preferably labeled with a same or similar number in all the figures in which they appear, in which:

Fig. 1 is a schematic block diagram of a Web site preparation system, in accordance with an exemplary embodiment of the present invention;
Fig. 2 is a flowchart of the acts of a compression unit in compressing a Web site, in accordance with an exemplary embodiment of the present invention;
Fig. 3 is a schematic block diagram of a structure of a file including a plurality of web pages stored as a single page, in accordance with an embodiment of the present invention;
Fig. 4 is a flowchart of the acts performed in downloading pages of a Web site, in accordance with an exemplary embodiment of the present invention; and
Fig. 5 is a schematic illustration of an exemplary simplified Web site and optional file organizations therefor, in accordance with an embodiment of the present invention.

DETAILED DESCRIPTION OF EXEMPLARY EMBODIMENTS

Fig. 1 is a schematic block diagram of a Web site preparation system 20, in accordance with an embodiment of the present invention. A Web site preparation computer 22, optionally a general purpose computer with suitable software, is used to generate a Web site. The Web site is then posted on a Web server 24 that provides the Web pages to clients. Alternatively or additionally, Web pages can be prepared directly on Web server 24.

In some embodiments of the invention, compression unit 26 is situated between preparation computer 22 and Web server 24. When a generated Web site is transferred from preparation computer 22 to Web server 24, compression unit 26 compresses the pages as described below and passes the Web site in its compressed form to Web server 24. Optionally, when a Web master retrieves the Web site from Web server 24 to preparation computer 22 in order to perform changes and/or add pages to the Web site, compression unit 26 decompresses the Web site files and passes the decompressed files to computer 22. Alternatively or additionally, compression unit 26 keeps an uncompressed copy of the Web site, which is provided to the Web master when changes are to be made. Thus, the Web master can perform changes in the compressed Web site without having knowledge of the compression tools and/or structures of the Web site. Optionally, the Web master is not aware of the compression performed by compression unit 26. Alternatively, the Web master uses the compressed Web site for performing further changes to the Web site.

In some embodiments of the invention, compression unit 26 stores a template of the compression and/or some of the parameters used in the compression, such that a subsequent compression, performed after changes are made, uses at least part of the results of previously performed compressions. Optionally, the template includes a map of the compressed Web site and/or other determined parameters of the Web site, as described below.

Alternatively or additionally to having compression unit 26 compress the Web site when it is passed from preparation computer 22 to Web server 24, compression unit 26 periodically and/or upon commands from the Web master compresses the Web site on Web server 24 and/or on preparation computer 22. Optionally, compression unit 26 uses statistics gathered by Web server 24 regarding the access to the different pages of the Web site, in performing the compression. In some embodiments of the invention, the compression is customized periodically according to the current statistics.

Alternatively or additionally, compression unit 26 is located between Web server 24 and a client 28. When a user enters the Web site, compression unit 26 compresses the Web site and/or portions thereof on the fly, optionally according to a user profile of the client.

In some embodiments of the invention, compression unit 26 and Web server 24 are located on separate processors. Alternatively, compression unit 26 is software located on Web server 24. Optionally, compression unit 26 is a plug-in unit, such as an MS Internet Information Server Filter, which cooperates with Web server 24. Alternatively or additionally to providing compression unit 26 in association with Web server 24, it may be provided in other manners, for example, as a proxy or as a stand-alone service.

In some embodiments of the invention, client 28 comprises a standard Web browser and does not require any special software in order to carry out the invention. Alternatively, some clients may be optimized for use with client 28 so as to enhance the advantages of the present invention. In a particular example, client 28 may comprise a browser including software to decompress page files (as described below).

Fig. 2 is a flowchart of the acts of compression unit 26 in compressing a Web site, in accordance with an embodiment of the present invention. Optionally, compression unit 26 creates (50) a map of the Web site. In some embodiments of the invention, the map includes indication of the Web pages of the Web site and the hypertext links which lead between the Web pages. Optionally, the map also indicates for each page the embedded elements of the page (e.g., images) which are included in separate files, and the pages which refer to each of the embedded elements. The map is possibly created using any method, such as applying the BFS or DFS graph traversing methods to the "graph" defined by the hyperlinked files.

Alternatively or additionally, the map and/or portions thereof are imported by compression unit 26 from an external hardware or software unit. Alternatively or additionally, the maps are created by the preparation computer 22, for example, by the site preparation software.
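The map-creation act (50) can be sketched as a breadth-first traversal of the link graph. The sketch below is a minimal illustration, not the patent's implementation: the `links` dictionary is a hypothetical in-memory stand-in for parsing real HTML files, and the function and page names are assumptions.

```python
from collections import deque


def build_site_map(links, start):
    """Breadth-first traversal of the link graph.

    `links` maps a page name to the list of pages it links to
    (a hypothetical stand-in for parsing the HTML files).
    Returns a dict of every reachable page -> its outgoing links.
    """
    site_map = {}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        if page in site_map:
            continue  # already visited through another link
        targets = links.get(page, [])
        site_map[page] = list(targets)
        queue.extend(t for t in targets if t not in site_map)
    return site_map


links = {
    "index.html": ["a.html", "b.html"],
    "a.html": ["b.html", "index.html"],
    "b.html": ["c.html"],
    "orphan.html": ["index.html"],
}
site_map = build_site_map(links, "index.html")
print(sorted(site_map))  # ['a.html', 'b.html', 'c.html', 'index.html']
```

Pages that no visited page links to, such as `orphan.html` here, are not discovered by the traversal; a real tool would probably also enumerate the files on disk.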

In some embodiments of the invention, compression unit 26 finds (52) duplicate embedded elements and/or duplicated pages. In an exemplary embodiment of the invention, compression unit 26 reviews the embedded elements of the Web site, while preparing the map or separately, before or after, and prepares a short catalog which lists a few parameters (e.g., length, type and/or leading bits) of each of the embedded elements. Thereafter, elements which have identical parameter values are compared (e.g., bit by bit) to determine whether they are identical. In some embodiments of the invention, elements which are determined to be similar but not identical are each split into two separate portion elements, one portion element which is identical in all the similar elements and one portion element which contains the non-identical portions, for each of the similar elements. A user alert may be generated in offline embodiments, to allow a user (e.g., site manager) to consolidate two files. Alternatively or additionally, an after-the-fact alert may be provided to a user, for example, in on-the-fly compression systems.
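The catalog-then-compare approach described above can be sketched as follows. This is a minimal illustration under stated assumptions: the element "type" is approximated by the file extension, the catalog uses leading bytes rather than bits, and the function name, `prefix_len` parameter and sample data are hypothetical.

```python
from collections import defaultdict


def find_duplicates(elements, prefix_len=16):
    """Group byte-identical embedded elements.

    `elements` maps a file name to its raw bytes. The catalog key
    (length, extension, leading bytes) is a cheap pre-filter; only
    elements sharing a key are compared in full.
    """
    catalog = defaultdict(list)
    for name, data in elements.items():
        ext = name.rsplit(".", 1)[-1].lower()
        catalog[(len(data), ext, data[:prefix_len])].append(name)

    groups = []
    for names in catalog.values():
        remaining = sorted(names)
        while remaining:
            first = remaining.pop(0)
            # Full comparison, done only within one catalog bucket.
            same = [n for n in remaining if elements[n] == elements[first]]
            remaining = [n for n in remaining if n not in same]
            if same:
                groups.append([first] + same)
    return groups


elements = {
    "logo.gif": b"GIF89a" + b"\x00" * 40,
    "img/logo_copy.gif": b"GIF89a" + b"\x00" * 40,
    # Same length and leading bytes as the logo, but a different last byte.
    "banner.gif": b"GIF89a" + b"\x00" * 39 + b"\x01",
}
duplicates = find_duplicates(elements)
print(duplicates)  # [['img/logo_copy.gif', 'logo.gif']]
```

The `banner.gif` entry lands in the same catalog bucket as the logo but is rejected by the full comparison, which is why the catalog alone is not treated as proof of identity.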

In some embodiments of the invention, compression unit 26 receives (54) statistical information on the visiting patterns of the Web site. The statistical information includes, for example, the number of visits in each page of the Web site, the frequencies of usage of the hypertext links of each of the pages and/or the frequencies of entrance to each of the pages from external and/or internal links.

In some embodiments of the invention, compression unit 26 groups (56) the Web pages into one or more page groups which are included in a single combined file. The grouping may depend, for example, on one or more of statistical considerations of intra and inter-group links and/or link following rates, on relative file sizes of each group, on a desire for certain pages to come up faster (e.g., smaller page files or different grouping) and/or on a sharing of embedded elements between pages.
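The source does not fix a grouping algorithm. One possible heuristic, shown here purely as an assumption, is a greedy merge: pages joined by the most frequently followed links are grouped first, as long as the group stays under a size budget. All names, rates and sizes below are hypothetical.

```python
def group_pages(sizes, link_follow_rate, max_group_bytes):
    """Greedy grouping: merge the groups joined by the most-followed
    links first, while the merged group stays under the size budget.

    sizes: page -> size in bytes
    link_follow_rate: (source, target) -> fraction of visits following the link
    """
    group_of = {page: {page} for page in sizes}
    ranked = sorted(link_follow_rate.items(), key=lambda item: item[1], reverse=True)
    for (src, dst), _rate in ranked:
        a, b = group_of[src], group_of[dst]
        if a is b:
            continue  # already in the same group
        merged = a | b
        if sum(sizes[p] for p in merged) <= max_group_bytes:
            for page in merged:
                group_of[page] = merged
    # Deduplicate the shared set objects and return a stable ordering.
    unique = {id(g): g for g in group_of.values()}
    return sorted(sorted(g) for g in unique.values())


sizes = {"home": 10_000, "news": 20_000, "about": 5_000, "gallery": 60_000}
rates = {
    ("home", "news"): 0.6,
    ("home", "gallery"): 0.3,
    ("home", "about"): 0.1,
    ("news", "about"): 0.05,
}
groups = group_pages(sizes, rates, max_group_bytes=40_000)
print(groups)  # [['about', 'home', 'news'], ['gallery']]
```

In this example the large gallery page is kept in its own combined file even though its link is followed often, reflecting the trade-off between link-following rates and file size mentioned above.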

Each page group is converted (58) into a single combined HTML page. The HTML descriptions of the pages in each group are stored together in a single respective combined file.

Hypertext links leading from one page of the group to another page of the group, are converted into Java scripts which actuate the display of the other page. In some embodiments of the invention, the display of the page does not change due to the replacement of the hypertext links by Java scripts.

In some embodiments of the invention, hypertext links which lead to other combined pages of the Web site are converted (60) into suitable links in accordance with the combining of the pages. Optionally, the hypertext link of a page is converted into a link that states the URL of the combined page with a parameter which states the position of the page in the combined page.
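The two link conversions described above (links within a group become script calls, links into another combined file gain a position parameter) can be sketched as below. The script function `showPage` and the `page` query parameter are hypothetical names, since the source does not specify them, and the regular expression is a simplification that only handles double-quoted `href` attributes.

```python
import re

LINK = re.compile(r'href="([^"]+)"')


def rewrite_links(html, same_file_pages, other_file_index):
    """Rewrite href targets in one page of a combined file.

    same_file_pages: page URL -> record position inside this combined file
    other_file_index: page URL -> (combined file URL, position in that file)
    `showPage` and the `page` parameter are hypothetical names.
    """
    def replace(match):
        target = match.group(1)
        if target in same_file_pages:
            return 'href="javascript:showPage(%d)"' % same_file_pages[target]
        if target in other_file_index:
            url, position = other_file_index[target]
            return 'href="%s?page=%d"' % (url, position)
        return match.group(0)  # link outside the compressed site: unchanged

    return LINK.sub(replace, html)


html = (
    '<a href="news.html">News</a> '
    '<a href="shop.html">Shop</a> '
    '<a href="http://example.com/">Out</a>'
)
rewritten = rewrite_links(html, {"news.html": 1}, {"shop.html": ("group2.html", 3)})
print(rewritten)
```

A production tool would more likely use an HTML parser than a regular expression, so that unquoted attributes and links inside scripts are handled correctly.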

Optionally, compression unit 26 determines whether the embedded elements are in a compressed form and, if necessary, compresses (62) or re-compresses the elements. In some embodiments of the invention, different compression ratios and/or compression methods are used for different embedded elements and/or in different compression instances. For example, the compression ratio may be adjusted according to user preferences (i.e., whether higher quality or faster service is desired), the bandwidth of the client's connection and/or the size of a specific combined file.

Fig. 3 is a schematic block diagram of the structure of a combined file 70, in accordance with an embodiment of the present invention. In some embodiments of the invention, combined file 70 comprises a master page record 72 that is automatically displayed by the client when the packet is downloaded. In addition, combined file 70 comprises one or more slave page records 74 that describe additional pages, which are not generally displayed immediately when combined file 70 is downloaded, but rather responsive to actuation of a Java script in one of the other pages included in combined file 70.

Optionally, master page record 72 comprises an open HTML description of one of the original pages of the group. Alternatively, master page record 72 describes a pseudo page which comprises an automatically opening script which initiates the display of one of the pages as described in a respective slave page record 74. When one of the slave pages is requested directly, a different page file may be sent, in which the slave page is a master.

Alternatively, only an indication of which page to show first is changed. Alternatively, the opening script may connect to Web server 24 or compression unit 26 to receive an indication of which page to show first. Alternatively, a separate file including such an indication is sent to the client.

Optionally, some or all of slave page records 74 comprise compressed HTML page descriptions. Alternatively, at least some of the pages are uncompressed. The slave page records are compressed using any suitable compression method, for example the LZ (Lempel Ziv) method and/or the WLZ (Walsh, Lempel, Ziv) method. Alternatively or additionally, slave page records 74 comprise standard non-compressed HTML page descriptions, for example Java script for displaying the pages.
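As an illustration of a combined file holding an uncompressed master page record and compressed slave page records, the sketch below packs the records into a JSON container. The container layout and field names are hypothetical, since the source does not specify a byte format, and Python's `zlib` (DEFLATE, which combines LZ77 with Huffman coding) stands in for "any suitable compression method".

```python
import base64
import json
import zlib


def build_combined_file(master_html, slave_pages):
    """Pack one uncompressed master page and compressed slave pages.

    The JSON container and its field names are hypothetical.
    """
    records = {
        name: base64.b64encode(zlib.compress(html.encode("utf-8"))).decode("ascii")
        for name, html in slave_pages.items()
    }
    return json.dumps({"master": master_html, "slaves": records})


def read_slave(combined, name):
    """Recover the HTML of one slave page record."""
    record = json.loads(combined)["slaves"][name]
    return zlib.decompress(base64.b64decode(record)).decode("utf-8")


news_html = "<html><body>" + "<p>headline</p>" * 50 + "</body></html>"
combined = build_combined_file("<html><body>Home</body></html>", {"news.html": news_html})
print(len(news_html), len(combined))
print(read_slave(combined, "news.html") == news_html)  # True
```

In a browser, the equivalent of `read_slave` would have to run as a script inside the master page, which is outside the scope of this sketch.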

In some embodiments of the invention, combined file 70 comprises embedded element records 76 which contain descriptions of the embedded elements of the pages represented by combined file 70. Hypertext links to the embedded elements are optionally converted to Java scripts that, conditionally or unconditionally, initiate the display of the contents of respective embedded element records 76, upon displaying the page. In some embodiments of the invention, embedded elements that are included in a plurality of the pages of the group are stored in only a single embedded element record 76, which may be actuated from a plurality of different pages. Alternatively or additionally, some of the embedded elements are stored in separate files and are downloaded responsive to a hypertext link, as is known in the art.

Possibly, the decompressed, shared, embedded objects are stored in a local cache of the browser and/or operating system.

In some embodiments of the invention, embedded elements which are included in a plurality of pages which are stored in different combined files 70 are repeated in each of the combined files 70. Alternatively or additionally, at least some of the embedded elements are stored only in one of the combined files 70 and when they are required for Web pages in other combined files 70, the combined file containing the embedded element is downloaded. Further alternatively or additionally, embedded elements which appear only in pages of a single file 70 are stored within the file, while embedded elements which are included in pages of a plurality of files 70 are stored in separate files.
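The last alternative above (elements used by one combined file are stored inside it, elements used by several are stored separately) amounts to a simple placement rule. The sketch below is one way to express it; the function name and the sample file and element names are hypothetical.

```python
def place_embedded_elements(element_pages, page_to_file):
    """Decide where each embedded element is stored.

    element_pages: element name -> pages that reference it
    page_to_file: page name -> combined file containing that page
    Returns (inline, separate): elements stored inside one combined file,
    keyed by that file, and elements kept as separate files.
    """
    inline, separate = {}, set()
    for element, pages in element_pages.items():
        files = {page_to_file[p] for p in pages}
        if len(files) == 1:
            inline.setdefault(next(iter(files)), []).append(element)
        else:
            separate.add(element)
    return inline, separate


page_to_file = {"home": "group1", "news": "group1", "gallery": "group2"}
element_pages = {
    "logo.gif": ["home", "news", "gallery"],  # used across both files
    "ticker.gif": ["home", "news"],           # used only within group1
    "photo.jpg": ["gallery"],
}
inline, separate = place_embedded_elements(element_pages, page_to_file)
print(inline)    # {'group1': ['ticker.gif'], 'group2': ['photo.jpg']}
print(separate)  # {'logo.gif'}
```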

In some embodiments of the invention, the contents of combined file 70 are arranged such that the master page 72 may be displayed, partially or in its entirety, by the client, before all of combined file 70 is received. Optionally, master page 72 with the page record 74 of the page automatically displayed and the embedded element records referenced by the automatically displayed page are located at the top (i.e., the first transmitted area) of combined file 70.

Fig. 4 is a flowchart of the acts performed in downloading pages of a Web site, in accordance with an exemplary embodiment of the present invention. Client 28 transmits (80) to Web server 24 a request to view a page of the Web site. Web server 24 responds by transmitting (82) a combined file 70 that includes the requested file as the automatically displayed page. In some embodiments of the invention, each of the pages of the Web site is generally included in a single combined file 70. When more than one of the pages included in a single combined file 70 may be accessed directly, Web server 24 optionally adjusts the combined file (if necessary) before it is transmitted, so that the automatically displayed page of the file is the requested page. Alternatively or additionally, Web server 24 carries a few versions of at least some of the combined files 70, which versions differ in the page automatically displayed when the file is downloaded. Possibly, web server 24 determines which version to transmit (82) according to the requested page, for example, by the request address mapping to a suitable stored file version.
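The option of storing several versions of a combined file, differing only in the automatically displayed page, can be sketched as a lookup from the requested address to a stored version. The data structures, file names and function name are hypothetical; the source only says the request address maps to a suitable stored file version.

```python
def select_combined_file(requested_page, page_to_group, stored_versions):
    """Pick the stored combined-file version whose automatically
    displayed page is the requested page.

    page_to_group: page URL -> group id
    stored_versions: (group id, first page URL) -> stored file name
    Returns None when no matching version is stored.
    """
    group = page_to_group.get(requested_page)
    if group is None:
        return None
    return stored_versions.get((group, requested_page))


page_to_group = {"/index.html": "g1", "/news.html": "g1", "/members.html": "g1"}
stored_versions = {
    ("g1", "/index.html"): "g1_index_first.html",
    ("g1", "/news.html"): "g1_news_first.html",
}
print(select_combined_file("/news.html", page_to_group, stored_versions))
print(select_combined_file("/members.html", page_to_group, stored_versions))
```

A page with no stored version, like `/members.html` here, returns `None`; the caller could then adjust a combined file on the fly or fall back to another page of the group.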

Optionally, clients are not allowed direct access to some pages of the Web site without passing through previous pages of the Web site. In some embodiments of the invention, such pages are included in a combined file 70 in which they are not the automatically displayed page. When a request for such a page is received, the respective combined file 70 is optionally downloaded and a different page from the file is automatically displayed. Alternatively, Web server 24 responds with an error message to such requests.

Thus, Web server 24 can simply prevent direct access to a desired page without passing through an introductory page, for example, a log-in page, which the Web master wants all clients to display before reaching the desired page. Alternatively or additionally, a log-in page or other pre-personalization page may be sent as a separate file 70, with group pages only being generated once the personalization of the pages is determined. The personalization may be applied to HTML files, which are then compressed. Alternatively, it may be applied directly to the compressed files, for example by record replacement.

When client 28 receives combined file 70, it automatically accesses master page record 72 and accordingly displays (84) one of the pages included in the received combined file 70. In some embodiments of the invention, the operation of client 28 in opening combined file 70 is exactly as if a regular HTML file is received. Optionally, the user does not know that combined file 70 is not a regular HTML file. As described above, the displayed page typically includes one or more controls which actuate Java scripts. These controls operate from the point of view of a user of client 28 in substantially the same way as hypertext links. In addition, in an exemplary embodiment of the invention, the page typically includes one or more hypertext links that relate to Web pages not included in the downloaded file.

The user of client 28 may actuate one of the controls on the displayed page. Responsive thereto, the respective Java script of the control is actuated. Optionally, the Java control decompresses (86) the contents of the respective slave page record 74, and displays the page, only when needed. Alternatively, some or all of records 74 in a received file 70 are decompressed upon receipt and are stored in a temporary memory in a decompressed form. In some embodiments of the invention, the decompression (86) is performed seamlessly such that the user of client 28 does not notice the decompression.
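The decompress-only-when-needed option can be illustrated with a short sketch. It is written in Python rather than as a browser script, and it rests on assumptions not found in the description: the compression format is taken to be zlib, and the class `SlavePageStore` with its methods is hypothetical.

```python
import zlib
from typing import Dict, List


class SlavePageStore:
    """Holds compressed page records and decompresses each one on first use."""

    def __init__(self, records: Dict[str, bytes]) -> None:
        self._records = records              # page id -> zlib-compressed HTML (assumed format)
        self._decoded: Dict[str, str] = {}   # page id -> decompressed HTML

    def show(self, page_id: str) -> str:
        """Return the page's HTML, decompressing it only if not done before."""
        if page_id not in self._decoded:
            raw = zlib.decompress(self._records[page_id])
            self._decoded[page_id] = raw.decode("utf-8")
        return self._decoded[page_id]

    def decoded_ids(self) -> List[str]:
        """List the pages that have been decompressed so far."""
        return sorted(self._decoded)


store = SlavePageStore({
    "page2": zlib.compress(b"<html>two</html>"),
    "page3": zlib.compress(b"<html>three</html>"),
})
```

Calling `store.show("page2")` decompresses only that record and leaves `page3` compressed. The alternative of decompressing all records upon receipt would correspond to calling `show` for every record as soon as the file arrives.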

In some embodiments of the invention, the Java control is a stand-alone script which does not require additional commands for operation. Alternatively, the Java control actuates, with one or more specific parameter values, a separate Java script, which is used by substantially all the Java scripts of pages compressed in accordance with the present invention, for example being a decompression program. Optionally, the separate Java script is stored within a browser of the client. Alternatively or additionally, the separate Java script is provided as an embedded element.

If the user actuates a regular hypertext link, client 28 sends a request accordingly to Web server 24 (or a different, unrelated web server, for links outside the site) that responds by transmitting the desired page (in a regular HTML page or in another combined file 70).

In some embodiments of the invention, if the user actuates, in this additional page, a hypertext link which leads to one of the slave pages of the previous combined file 70, the browser finds the referenced combined file 70 in its cache and a parameter in the link leads to the specific desired page within the combined file (e.g., the link in the additional page may be adapted to match the previous combined file 70). Alternatively, client 28 sends a request with the URL of the requested page to Web server 24. In some embodiments of the invention, Web server 24 responds with a short HTML file which includes a Java script that accesses the slave page record 74 of the requested page in the combined file 70. Further alternatively, Web server 24 responds by re-transmitting the requested page, either by itself or with a combined file in which the requested page is the automatically displayed page. Further alternatively, each time a combined file 70 is received by client 28, a Java script extracts the pages in slave page records 74 into a cache of client 28, as if the pages were received on their own as regular HTML files. Each page is stored in the cache with its URL, such that when a request for the page is generated while the page is still in the cache, the page is found in the cache and displayed therefrom.
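The last alternative, in which received pages are extracted into the client cache under their own URLs, may be sketched as below. This is a simplified Python illustration; the dictionary-based cache, the function names and the example URLs are assumptions made for the example and do not describe an actual browser cache interface.

```python
from typing import Dict, Optional


def extract_to_cache(combined: Dict[str, str], cache: Dict[str, str]) -> None:
    """Store every page of a received combined file in the cache under its URL."""
    for url, html in combined.items():
        cache[url] = html


def fetch(url: str, cache: Dict[str, str]) -> Optional[str]:
    """Return the cached page, or None to signal that a server request is needed."""
    return cache.get(url)


cache: Dict[str, str] = {}
# Hypothetical received combined file, reduced to a URL -> HTML mapping.
received = {
    "http://site.example/1.html": "<html>page 1</html>",
    "http://site.example/4.html": "<html>page 4</html>",
}
extract_to_cache(received, cache)
```

After extraction, a later request for `http://site.example/4.html` is served by `fetch` from the cache, exactly as if the page had been received on its own, while a request for a page that was never received returns `None` and falls through to the server.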

In some embodiments of the invention, Web server 24 hosts, for at least some of the Web pages of the site, a plurality of combined files 70 that include the Web page. The plurality of combined files include the Web page with different other pages and/or with different compression styles and/or ratios. When client 28 transmits (80) to Web server 24 a request to view the Web page, Web server 24 chooses one of the plurality of files containing the page to be transmitted (82) to client 28, responsive to which combined files the user previously downloaded and/or responsive to a user profile. The user profile may include, for example, a standard user behavior (e.g., whether the user usually actuates hypertext links at the top or the bottom of the page), topics which interest the user and/or user preferences (e.g., voice files, long articles, images). Alternatively or additionally, Web server 24 customizes, on the fly, the combined file 70 in which the requested page is the automatically displayed page, based on the user profile.

Fig. 5 is a schematic illustration of an exemplary simplified Web site 90 and two optional file organizations 92 and 94, in accordance with an embodiment of the present invention. Web site 90 comprises Web pages indicated by digits 1-6, and links between the pages are indicated by arrows. A first optional file organization 92 includes two combined files (A and B). When a page of Web site 90 is requested from Web server 24, the file containing the requested page is transferred to the client, with the requested page being set as the automatically displayed page of the file. This organization not only provides faster transmission of the Web pages to the user, but also reduces the space required to store the Web site on Web server 24.

In optional file organization 94, substantially each page (1, 2, 3, 4 and 5) has a separate file (C, D, E, F and G, respectively), which is transmitted to the client if the user first enters the Web site from that particular page. The respective pages of the file, listed first in Fig. 5, are the pages which are automatically displayed when the files are downloaded. Each file (C, D, E, F and G) is customized for its respective page such that the pages included in its slave page records 74 are the pages to which the user is most likely to move from the displayed page. For example, in file E page 3 is accompanied by pages 1 and 5, the only pages in the Web site to which page 3 has hypertext links.

In an exemplary scenario, a user enters the Web site from page 1, and therefore receives file C. From page 1 the user moves to page 4 by actuating the respective Java script which displays page 4 from the contents of file C. No transmission is thus required from Web server 24 for displaying page 4. From page 4 the user moves to page 3, again using a Java script and the contents of file C. From page 3 the user moves to page 5, which is not included in file C (e.g., the link is not encoded as a Java script to display part of the same file, but as a regular HTTP link). Therefore, a request for page 5 is sent to Web server 24 which responds by transmitting file G to the client. It is noted that in some embodiments of the invention, page 4 is re-transmitted in file G although it was already transmitted in file C.

Alternatively to re-transmitting pages transmitted in other files (e.g., page 4), in some embodiments of the invention, before transmitting a file (e.g., file G), server 24 determines whether one or more of the pages and/or embedded elements in the file were recently transmitted to the client. Such pages and/or embedded elements are optionally replaced in the file by short Java scripts which refer to the pages and/or embedded elements in previously transmitted files. In the current example, the description of page 4 is replaced in file G by a Java script which, if actuated, displays page 4 based on the contents of file C, which is typically still in the cache of the client. If, when the Java script is actuated, file C has already been erased from the cache, a request for page 4 is transmitted and, accordingly, file F is forwarded to the client.
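The replacement of recently transmitted pages by short references, and the fallback when the earlier file has left the cache, can be sketched as follows. This is a Python illustration only: the record representation (a string for a full description, a `("ref", file_name)` tuple for a reference), the function names and the page contents are assumptions, and the page numbers and file letters follow the example of Fig. 5.

```python
from typing import Dict, Optional, Tuple, Union

Record = Union[str, Tuple[str, str]]  # full description, or ("ref", file name)


def build_file(pages: Dict[int, str], already_sent: Dict[int, str]) -> Dict[int, Record]:
    """Replace pages the client recently received by references to the earlier file."""
    out: Dict[int, Record] = {}
    for page, body in pages.items():
        if page in already_sent:
            out[page] = ("ref", already_sent[page])
        else:
            out[page] = body
    return out


def resolve(page: int, record: Record,
            cache: Dict[str, Dict[int, str]]) -> Optional[str]:
    """Client side: return the page, or None if it must be requested from the server."""
    if isinstance(record, str):
        return record
    _, file_name = record
    cached_file = cache.get(file_name)
    if cached_file is None:
        return None  # earlier file was erased from the cache; request the page anew
    return cached_file[page]


# File C was sent earlier and contained pages 1, 3 and 4 (contents invented here).
file_c = {1: "<page 1>", 3: "<page 3>", 4: "<page 4>"}
file_g = build_file({5: "<page 5>", 4: "<page 4>"}, already_sent={1: "C", 3: "C", 4: "C"})
```

With file C still cached, `resolve(4, file_g[4], {"C": file_c})` yields the description of page 4 without any transmission. With an empty cache it yields `None`, which corresponds to the case in which a request for page 4 is sent and file F is forwarded.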

In file organization 94 there is no file for page 6, as it is assumed that page 6 may not be accessed from outside the Web site. If this is not true, a separate file for page 6 may be included in file organization 94.

Although file organization 94 requires more space on server 24 than file organization 92, it may provide a faster response time as each page is transmitted with the pages to which the user is most likely to move. Alternatively, file organization 94, or parts thereof, is not actually stored in its entirety on Web server 24. Rather, the files of file organization 94 are generated on the fly from stored building blocks.

Referring in detail to grouping (56, Fig. 3) the Web pages, in some embodiments of the invention, the pages are grouped according to the site map and the usage statistics such that the pages transmitted with a requested page are those which are most likely to be accessed from the requested page. Alternatively or additionally, the pages transmitted with a requested page are those which are most likely to be accessed by the client in the near future.
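Grouping by site map and usage statistics can be illustrated by a small sketch that selects, for a requested page, the linked pages most often visited next. The Python code below is a simplified example under stated assumptions: the transition counts are invented, and the function `companions` and the limit `k` on the number of accompanying pages are hypothetical rather than part of the described method.

```python
from typing import Dict, List, Tuple


def companions(page: int, transitions: Dict[Tuple[int, int], int], k: int) -> List[int]:
    """Return up to k pages most often visited directly after the given page.

    transitions maps (source page, destination page) to an observed visit count;
    only pairs that correspond to links in the site map are expected to appear.
    """
    candidates = [(count, dst) for (src, dst), count in transitions.items() if src == page]
    # Highest count first; ties broken by page number for a stable result.
    candidates.sort(key=lambda c: (-c[0], c[1]))
    return [dst for _, dst in candidates[:k]]


# Invented usage statistics for a six-page site.
transitions = {(1, 4): 50, (1, 2): 10, (1, 3): 30, (3, 1): 12, (3, 5): 7}
```

Here `companions(1, transitions, 2)` selects pages 4 and 3 to accompany page 1. Predicting the pages most likely to be accessed in the near future, rather than only the next page, would need statistics over longer paths than the single-step counts assumed in this sketch.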

Alternatively or additionally, the pages are grouped such that when possible all the pages which reference a specific embedded element are included in a single combined file.

In some embodiments of the invention, the pages are grouped responsive to a desired size (or size range) of combined files 70 and/or the bandwidth with which the client connects to the server. Optionally, the desired size is approximately equal to the average size of the files of the original Web pages of the compressed Web site. Alternatively, the desired size is a predetermined percent greater than the size of the original Web pages, possibly a percent which is substantially unnoticed by the client or the user at the client.
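Grouping responsive to a desired file size may be sketched as a greedy fill. The Python example below is only one possible policy, given under stated assumptions: the candidate pages are taken to be already ranked by likelihood of access, all sizes are compressed sizes in arbitrary units, and the function `fill_to_budget` and the numbers used are hypothetical.

```python
from typing import List, Tuple


def fill_to_budget(requested_size: int, ranked: List[Tuple[str, int]], budget: int) -> List[str]:
    """Add ranked candidate pages to the requested page while the file stays within budget.

    ranked holds (page name, compressed size) pairs, most likely page first.
    A candidate that would exceed the budget is skipped, and smaller later
    candidates may still be added.
    """
    chosen: List[str] = []
    total = requested_size
    for name, size in ranked:
        if total + size <= budget:
            chosen.append(name)
            total += size
    return chosen


# Requested page of size 10, desired combined-file size 20 (invented numbers).
selected = fill_to_budget(10, [("a", 5), ("b", 20), ("c", 4)], 20)
```

In this run page `a` fits, page `b` would exceed the desired size and is skipped, and page `c` still fits, so the combined file carries the requested page together with `a` and `c`. The budget could equally be derived from the client's bandwidth, as the paragraph above suggests.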

In some embodiments of the invention, compression unit 26 splits one or more pages into a plurality of separate pages. For example, in some embodiments of the invention, pages that contain both static information, which never or rarely changes, and dynamic information (e.g., updated hourly) are split into dynamic and static parts, which may be compressed separately. Optionally, when the page is downloaded by a client, a first part of the page (e.g., the static part) is first downloaded to the client, and when the first part is opened by the client, the client requests the second part in the same manner in which an embedded element is normally requested.

In some embodiments of the invention, when a page transmitted within a slave page record 74 is opened (before any hypertext links of the page are actuated) by client 28 using a Java script, the Java script may initiate, for certain pages, the retrieval of another file which includes additional pages which are most likely to be accessed by the user from the current page. Thus, in some cases, the pages are transmitted to the client before the client requests them, and the latency of waiting for the pages to arrive is shortened.

In some embodiments of the invention, client 28 is customized for use with servers that operate in accordance with the present invention. Optionally, the client displays in a special color (or other indication) links that lead to pages already downloaded. Alternatively or additionally, the client notifies, at the beginning of an HTTP session, which particular parameters compression unit 26 should use, for example, whether embedded elements should be transmitted within the same file as the HTML of the pages or in a separate file. Such notification may be, for example, by way of cookies.

It is noted that although the above description relates to reorganizing a Web site, the present invention may be applied to substantially any group of Web pages, referred to herein as a virtual Web site. For example, a page which provides search results may be compressed and transmitted to the client with one or more of the pages found in the search. An exemplary implementation is described in Israel application 133,888, the disclosure of which is incorporated herein by reference.

It is further noted that although the present invention has been described in relation to the TCP/IP protocol suite, some embodiments of the invention may be implemented with relation to other packet based transmission protocols, such as, for example, IPX, DECNET and the ISO protocols. Furthermore, although the present invention is described with relation to the HTTP protocol, the principles of the present invention may be used with relation to other application protocols, such as WAP (wireless application protocol), WML, and e-mail transmission of pages. For example, instead of transmitting a newsletter or other e-mail with links to one or more sites, the e-mail may include one or more combined files that include some or all of the referenced pages. It is also noted that the present invention may be used for tasks other than transmission, for example, for storage.

In addition, although the hypertext links were described as being replaced by Java scripts, any other scripts or controls may be used, including but not limited to, VB-Scripts, Java applets and ActiveX scripts.

It will be appreciated that the above described methods may be varied in many ways, including changing the order of steps and the exact implementation used. It should also be appreciated that the above description of methods and apparatus is to be interpreted as including apparatus for carrying out the methods and methods of using the apparatus.

The present invention has been described using non-limiting detailed descriptions of embodiments thereof that are provided by way of example and are not intended to limit the scope of the invention. It should be understood that features and/or steps described with respect to one embodiment may be used with other embodiments and that not all embodiments of the invention have all of the features and/or steps shown in a particular figure or described with respect to one of the embodiments. Variations of embodiments described will occur to persons of the art.

It is noted that some of the above described embodiments describe the best mode contemplated by the inventors and therefore include structure, acts or details of structures and acts that may not be essential to the invention and which are described as examples. Structure and acts described herein are replaceable by equivalents which perform the same function, even if the structure or acts are different, as known in the art. Therefore, the scope of the invention is limited only by the elements and limitations as used in the claims. When used in the following claims, the terms "comprise", "include", "have" and their conjugates mean "including but not limited to".