Title:
DISTRIBUTED PUBLISHER-SUBSCRIBER MODEL
Document Type and Number:
WIPO Patent Application WO/2019/158966
Kind Code:
A2
Inventors:
SHARMA PRATIK (IN)
Application Number:
PCT/IB2018/050888
Publication Date:
August 22, 2019
Filing Date:
February 14, 2018
Assignee:
SHARMA PRATIK (IN)
International Classes:
G06F15/16; H04L29/06
Claims:

Following is the claim for this invention:

1. In this invention we provide a high-throughput, low-latency platform for handling real-time data feeds coming from different customer sites. To achieve this, we use a massively scalable publisher-subscriber model with distributed message blocks, each containing a contiguous set of data records, running on different cluster nodes. We store messages coming from producers, which are processes running on different virtual machines. Messages are stored in the form of message blocks (each containing a set of records), which can be divided into partitions, each identified by a unique symbol or name. A set of partitions belongs to a unique customer site identifier or site name, and a partition can belong to one and only one site identifier or site name, with a set of producers publishing or sending sets of data records to that partition. When the partitions for a particular customer site grow, we scale the cluster of nodes on which the distributed publisher-subscriber model runs and move partitions belonging to the same site identifier onto different virtual machines. We maintain a dynamic configuration recording which partitions of a particular site reside on which virtual machines, along with their memory addresses in the cluster. A consumer or group of consumers consuming the message blocks belonging to a site identifier subscribes to different partitions of that site identifier. Because we know the starting memory address of each partition for a customer site, we maintain only an offset per consumer for each partition belonging to a site identifier. When all the consumers have consumed all the memory blocks of a partition, we can, if required, take a backup of the partition with a timestamp and save it to persistent storage; we can then reset the memory consumed by that partition and start filling its memory blocks from the starting address of that partition. Only one publisher at a time can produce data records and write into the corresponding partition for a customer site (other publishers or producers wait until the current one is done writing), but many consumers can consume data records at the same time because each has its own offset. The above novel technique by which we achieve the distributed publisher-subscriber model is the claim for this invention.

Description:
Distributed Publisher-Subscriber Model

In this invention we provide a high-throughput, low-latency platform for handling real-time data feeds coming from different customer sites. To achieve this, we use a massively scalable publisher-subscriber model with distributed message blocks, each containing a contiguous set of data records, running on different cluster nodes. We store messages coming from producers, which are processes running on different virtual machines. Messages are stored in the form of message blocks (each containing a set of records), which can be divided into partitions, each identified by a unique symbol or name.

A set of partitions belongs to a unique customer site identifier or site name, and a partition can belong to one and only one site identifier or site name, with a set of producers publishing or sending sets of data records to that partition. When the partitions for a particular customer site grow, we scale the cluster of nodes on which the distributed publisher-subscriber model runs and move partitions belonging to the same site identifier onto different virtual machines. We maintain a dynamic configuration recording which partitions of a particular site reside on which virtual machines, along with their memory addresses in the cluster.

A consumer or group of consumers consuming the message blocks belonging to a site identifier subscribes to different partitions of that site identifier. Because we know the starting memory address of each partition for a customer site, we maintain only an offset per consumer for each partition belonging to a site identifier. When all the consumers have consumed all the memory blocks of a partition, we can, if required, take a backup of the partition with a timestamp and save it to persistent storage; we can then reset the memory consumed by that partition and start filling its memory blocks from the starting address of that partition.
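
The bookkeeping described above (which virtual machine hosts each partition, one offset per consumer, and backup followed by reset once every consumer has caught up) can be illustrated with a minimal single-process Python sketch. The document specifies none of these details, so all names here (`Partition`, `Cluster`, `poll`, `backup_and_reset`, the `vm` and `base_address` fields) are hypothetical. A Python list stands in for the partition's memory region, and a plain dictionary stands in for persistent storage.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Partition:
    """One partition of a customer site (hypothetical layout)."""
    site_id: str
    name: str
    vm: str              # virtual machine currently hosting the partition
    base_address: int    # illustrative stand-in for the starting memory address
    records: List[bytes] = field(default_factory=list)
    offsets: Dict[str, int] = field(default_factory=dict)  # consumer -> next index

    def subscribe(self, consumer: str) -> None:
        # A new consumer starts at the beginning of the partition.
        self.offsets.setdefault(consumer, 0)

    def publish(self, batch: List[bytes]) -> None:
        self.records.extend(batch)

    def poll(self, consumer: str, max_records: int = 10) -> List[bytes]:
        # Only the offset is stored per consumer; the start of the partition is known.
        start = self.offsets[consumer]
        batch = self.records[start:start + max_records]
        self.offsets[consumer] = start + len(batch)
        return batch

    def fully_consumed(self) -> bool:
        return bool(self.offsets) and all(
            off >= len(self.records) for off in self.offsets.values())

    def backup_and_reset(self, store: Dict[Tuple[str, str, float], List[bytes]]) -> bool:
        """Save a timestamped copy, then refill from the start of the partition."""
        if not self.fully_consumed():
            return False
        store[(self.site_id, self.name, time.time())] = list(self.records)
        self.records.clear()
        for consumer in self.offsets:
            self.offsets[consumer] = 0
        return True


class Cluster:
    """Dynamic configuration: where each partition of each site lives."""

    def __init__(self) -> None:
        self.partitions: Dict[str, Partition] = {}

    def add_partition(self, site_id: str, name: str, vm: str, base_address: int) -> Partition:
        # Partition names are unique, so a partition belongs to exactly one site.
        if name in self.partitions:
            raise ValueError(f"partition {name!r} already belongs to a site")
        self.partitions[name] = Partition(site_id, name, vm, base_address)
        return self.partitions[name]

    def move(self, name: str, vm: str, base_address: int) -> None:
        # Relocate a partition to another virtual machine as the site grows.
        part = self.partitions[name]
        part.vm, part.base_address = vm, base_address

    def placement(self, site_id: str) -> Dict[str, Tuple[str, int]]:
        return {p.name: (p.vm, p.base_address)
                for p in self.partitions.values() if p.site_id == site_id}
```

In this sketch, moving a partition only updates the placement record. A real deployment would also have to copy the partition's memory to the target virtual machine, which the source does not describe.
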
Only one publisher at a time can produce data records and write into the corresponding partition for a customer site (other publishers or producers wait until the current one is done writing), but many consumers can consume data records at the same time because each has its own offset.
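
The single-writer, many-reader rule can be sketched with a mutual-exclusion lock. This is an illustration under assumptions, not the mechanism the invention necessarily uses: the names `LockedPartition`, `Consumer`, `publish` and `read_from` are hypothetical, producers are modeled as threads rather than processes on separate virtual machines, and lock-free reads of a Python list rely on CPython's list operations being safe to call concurrently.

```python
import threading
from typing import List


class LockedPartition:
    """Partition that admits one producer at a time (hypothetical sketch)."""

    def __init__(self) -> None:
        self._records: List[str] = []
        self._writer_lock = threading.Lock()  # held by at most one producer

    def publish(self, producer: str, batch: List[int]) -> None:
        # Other producers block here until the current producer finishes its batch.
        with self._writer_lock:
            for value in batch:
                self._records.append(f"{producer}:{value}")

    def read_from(self, offset: int) -> List[str]:
        # Readers never take the writer lock; they only look at records past their offset.
        return self._records[offset:]


class Consumer:
    """Each consumer keeps its own offset, so consumers do not coordinate with each other."""

    def __init__(self, partition: LockedPartition) -> None:
        self.partition = partition
        self.offset = 0

    def poll(self) -> List[str]:
        batch = self.partition.read_from(self.offset)
        self.offset += len(batch)
        return batch
```

Because the lock is held for a whole batch, records from one producer land contiguously in the partition, while any number of `Consumer` objects can call `poll` independently.
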