
Title:
SYSTEM AND METHOD FOR GENERATING ENRICHED PROPERTY LISTINGS
Document Type and Number:
WIPO Patent Application WO/2024/028862
Kind Code:
A1
Abstract:
A computer implemented method for creating an enriched property listing, the method includes retrieving data, by a processing unit, from a structured feed of a plurality of external data sources and deriving a first plurality of features associated with a property, wherein the plurality of external data sources comprises at least one property listing, extracting/calculating data, by a processor of the processing unit, from a non-structured feed of the plurality of external data sources and deriving a second plurality of features associated with the property, and generating, by the processor, an enriched property listing by applying big-data analysis and/or machine learning algorithms on the first and/or second plurality of features, wherein the generating of the enriched property listing comprises classifying the property into one or more classes/categories, wherein applying the big-data analysis and/or machine learning algorithms further provides a prediction of a change in one or more of the property's features.

Inventors:
YAKOEL EREZ (IL)
RUBIN ASAF (IL)
Application Number:
PCT/IL2023/050787
Publication Date:
February 08, 2024
Filing Date:
July 30, 2023
Assignee:
ROSETAL SYSTEM INFORMATION LTD (IL)
International Classes:
G06Q50/16; G06F40/20; G06N20/00
Domestic Patent References:
WO2018220387A1 2018-12-06
Foreign References:
KR20220087702A 2022-06-27
JP2020046854A 2020-03-26
Attorney, Agent or Firm:
FISHER Michal et al. (IL)
Claims:
CLAIMS

WHAT IS CLAIMED IS:

1. A computer implemented method for producing an enriched property listing, the method comprising: obtaining information indicating availability of a property, retrieving or collecting, utilizing a processing unit, basic data regarding the available property from a plurality of structured feeds, formatting the basic data into a standard format, preprocessing the basic data to remove duplications, identify property value contradictions and expel incorrect property values, identifying additional property features and neighborhood data, related to the available property, by applying an NLP model on unstructured data sources, the unstructured data sources comprising one or more of social media posts, residential reviews, police reports, municipal reports and/or violations, and news items; applying an AI algorithm or big data analysis on the basic data, the property features, and the neighborhood data to determine a status of one or more current quality-of-life attributes of the available property; applying a predictive AI algorithm or big data analysis on the basic data, the property features, and the neighborhood data, to determine a predicted future status of the one or more quality-of-life attributes of the available property, and automatically producing an enriched property listing presenting the current and predicted statuses of the one or more quality-of-life attributes.

2. The method of claim 1, further comprising applying a geographical model to calculate direct sunlight hours per day in the property.

3. The method of claim 2, wherein determining the current status of the one or more quality-of-life attributes further comprises applying the AI algorithm or the big data analysis on the calculated direct sunlight hours per day in the property.

4. The method of any one of claims 1-3, wherein the structured feeds comprise a plurality of property listings of the available property.

5. The method of any one of claims 1-4, wherein the one or more quality-of-life attributes is selected from safety, environmental friendliness, children friendliness, senior citizen friendliness, crime level, noise level, pollution level, natural light, gentrification, building type, view from window(s), building maintenance level, transportation, community, traffic jams, air pollution, education, cleanness of surrounding areas or any combination thereof.

6. The method of any one of claims 1-5, wherein the AI algorithm is a machine learning algorithm trained on an unsupervised dataset comprising a plurality of properties and their associated current and future statuses of the one or more quality-of-life attributes.

7. The method of any one of claims 1-6, wherein the AI algorithm is a machine learning algorithm trained on a supervised dataset comprising a plurality of properties and a plurality of labels associated with each of the plurality of properties, the plurality of labels comprising: one or more validated current and future statuses of the one or more quality-of-life attributes.

8. The method of any one of claims 1-7, wherein the producing of the enriched property listing further comprises presenting a timeline for the predicted change in the future status of the one or more quality-of-life attributes.

9. The method of any one of claims 1-8, further comprising providing a prediction regarding a future change in an adjusted marginal price of the property, based on the computed future status of the one or more quality-of-life attributes.

10. The method of any one of claims 1-9, further comprising detecting rare amenities based on the current and/or future statuses of the one or more quality-of-life attributes.

11. The method of any one of claims 1-10, wherein the one or more quality-of-life attributes comprises at least two quality-of-life attributes.

12. The method of any one of claims 1-12, further comprising automatically updating the enriched property listing based on a periodic re-applying of the NLP model on the unstructured data sources.

13. The method of any one of claims 1-12, further comprising predicting a period of time of availability of the available property until a transaction is likely to occur, based on the current and future statuses of the one or more quality-of-life attributes.

Description:
SYSTEM AND METHOD FOR GENERATING ENRICHED PROPERTY LISTINGS

FIELD OF THE INVENTION

Embodiments of the disclosure relate to enriched online listings of properties that provide rich data on a property. Embodiments of the disclosure further relate to the capability to predict future changes in features related to the property and the value (current or perceived) of a property by applying AI models on data retrieved from structured and unstructured feeds.

BACKGROUND

Currently, if someone wants to explore the purchase of a real estate property in a particular geographic area, they can review listings that include information regarding the property such as the price, the size of the property, the square footage, the number of rooms, the location, whether it has a doorman, whether it is furnished, its air conditioning or heating system etc.

The information is however typically scarce and mostly includes basic information. When considering purchasing or renting a property, additional, more “personal” information, such as for example crime level, transportation, education, age of children in neighborhood, percentage of neighborhood kids going to local school, parking space for guests etc. is also desired.

There is therefore a need for systems and methods capable of providing additional sophisticated and valuable real estate related information.

SUMMARY OF THE EMBODIMENTS

According to some embodiments, there is provided a computer implemented method and/or platform for generation of enriched property listings.

Advantageously, the herein disclosed method and system apply AI models, such as but not limited to, image processing algorithms, natural language processing (NLP) and machine learning (ML), on structured as well as unstructured data feeds to derive features which can subsequently be used to generate an enriched property listing including information, such as for example, crime level, transportation, education, safety, cleanness of surrounding areas and building, noise level in the area, hours of sunlight, elevator reliability, view from the apartment, in their current status and/or in future predictions.

As a further advantage, by applying big data analytics and ML, the herein disclosed method and system allow for price modeling, which provides an adjusted marginal price of the property. Since this adjusted marginal price of the property takes into consideration specific features related to the property, and often, specific features that are important to a certain prospective client, it may be regarded as a "smart" price or a more realistic price which reflects the actual value of the property (whether in terms of monetary value or in terms of perceived value).

As a further advantage, by applying big data analytics and ML, the herein disclosed method and system further provide predictions regarding future features of the property and the area and the value of the property (whether in terms of monetary value or in terms of perceived value, e.g. predicted change in socio-economic status, predicted change in pollution, construction, infrastructure, traffic, etc.), which may be particularly beneficial if the property is bought for investment purposes.

In addition, by applying big data analytics, the herein disclosed method and system may advantageously further provide statistical information, such as for example, the duration of a property being listed before sale in the neighborhood, percentage of newcomers in neighborhood, age distribution, and the like.

There is provided herein, in accordance with some embodiments, a computer implemented method for creating an enriched property listing, the method comprising: retrieving data, by a processing unit, from a structured feed of a plurality of external data sources and deriving a first plurality of features associated with a property, wherein the plurality of external data sources comprises at least one property listing; extracting/calculating data, by a processor of the processing unit, from a non-structured feed of the plurality of external data sources and deriving a second plurality of features associated with the property; and generating, by the processor, an enriched property listing by applying big-data analysis and/or machine learning algorithms on the first and/or second plurality of features, wherein the generating of the enriched property listing comprises classifying the property into one or more classes/categories, wherein applying the big-data analysis and/or machine learning algorithms further provides a prediction of a change in one or more of the property's features.

There is provided herein, in accordance with some embodiments, a computer implemented method for creating an enriched property listing, the method comprising: retrieving data, by a processing unit, from a structured feed of a plurality of external data sources and deriving a first plurality of features associated with a property, wherein the plurality of external data sources comprises at least one property listing; extracting/calculating data, by a processor of the processing unit, from a non-structured feed of the plurality of external data sources and deriving a second plurality of features associated with the property; and generating, by the processor, an enriched property listing by applying big-data analysis and/or machine learning algorithms on the first and/or second plurality of features, wherein the generating of the enriched property listing comprises classifying the property into one or more classes/categories, wherein applying the big-data analysis and/or machine learning algorithms provides a prediction of a period of time of availability of the property until a transaction is expected to occur.

There is further provided herein, in accordance with some embodiments, a system for creating an enriched property listing, the system comprising: a processor configured to: retrieve data from a structured feed of a plurality of external data sources and derive a first plurality of features associated with a property, wherein the plurality of external data sources comprises at least one property listing; extract data from a non-structured feed of the plurality of external data sources and derive a second plurality of features associated with the property; and generate an enriched property listing by applying big-data analysis and/or machine learning algorithms on the first and/or second plurality of features, wherein the generating of the enriched property listing comprises classifying the property into one or more classes/categories, wherein applying the big-data analysis and/or machine learning algorithms further provides a prediction of a change in one or more of the property's features; and a user interface configured to output the enriched property listing.

There is further provided herein, in accordance with some embodiments, a system for creating an enriched property listing, the system comprising: a processor configured to: retrieve data from a structured feed of a plurality of external data sources and derive a first plurality of features associated with a property, wherein the plurality of external data sources comprises at least one property listing; extract/calculate data from a non-structured feed of the plurality of external data sources and derive a second plurality of features associated with the property; and generate an enriched property listing by applying big-data analysis and/or machine learning algorithms on the first and/or second plurality of features, wherein the generating of the enriched property listing comprises classifying the property into one or more classes/categories, wherein applying the big-data analysis and/or machine learning algorithms provides a prediction of a period of time of availability of the property until a transaction is expected to occur; and a user interface configured to output the enriched property listing.

According to some embodiments, applying the big-data analysis and/or machine learning algorithms further provides a prediction of a period of time of availability of the property until a transaction is expected to occur. For example, a prediction of how long the property is expected to be available until it goes off the market for any reason, for example, because it is sold or rented or because the owner has a change of heart and is no longer interested in the transaction. Such prediction may be specific per property or per area/neighborhood. According to some embodiments, the prediction may be based at least on past data and/or on identified trends (e.g., trends in property and/or neighborhood related features).

According to some embodiments, the predicted change in one or more of the property's features may include a change directly or indirectly related to the property features. Such change may include either a change related directly to the property itself or to the surrounding of the property, thus related indirectly to the property (e.g. a change in traffic, safety, view, sunlight hours, pollution, noise, green spaces, education, economy level of people in neighborhood, required future investment in the property, for example, elevator fix, renovation, etc.).

According to some embodiments, the method may further include/the processor may further be configured to predict a change in the property value, based on the predicted change of the one or more property's features. According to some embodiments, computing the predicted change in the value of the property may be based on at least one property-specific feature. According to some embodiments, computing the predicted change in the value of the property may include assigning a specific weight value to at least some of the features. According to some embodiments, computing the predicted change in the value of the property may include applying the big-data analysis and/or machine learning algorithm to a subset of property-specific features derived from the first and/or second plurality of features.
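As a minimal sketch of the weighting described above, and assuming a simple linear combination (the disclosure does not specify the model), a predicted relative change in value could be computed as a weighted sum of predicted feature changes. The function name `predicted_value_change`, the feature names, and all numeric values are hypothetical.

```python
from typing import Mapping


def predicted_value_change(
    feature_changes: Mapping[str, float],
    weights: Mapping[str, float],
) -> float:
    """Return a weighted sum of predicted feature changes.

    Features without an assigned weight are ignored, mirroring the
    "at least some of the features" wording in the text.
    """
    return sum(
        weights[name] * change
        for name, change in feature_changes.items()
        if name in weights
    )


# Hypothetical predicted changes (fractions; positive means improvement).
changes = {"noise_level": -0.10, "transportation": 0.40, "view": 0.05}
# Hypothetical weights; "view" is deliberately left unweighted.
weights = {"noise_level": 0.5, "transportation": 0.25}

print(predicted_value_change(changes, weights))
```

A trained machine learning model could replace the fixed weights; the linear form is used here only because it makes the role of each weight easy to see.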

According to some embodiments, predicting the change in the one or more of the property's features may be based on an identified trend in the collected/extracted data and/or in features derived therefrom. According to some embodiments, the trend may be identified by applying the big-data analysis and/or machine learning algorithms to the collected/extracted data and/or to features derived therefrom. According to some embodiments, the enriched property listing may be configured to allow an on-demand class/category-based retrieval of the enriched property listing.

According to some embodiments, the method may further include/the processor may further be configured to transform the retrieved data or the extracted/calculated data into a standard format.

According to some embodiments, the method may further include/the processor may further be configured to preprocess the transformed data to remove contradictions and/or duplications.
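The transformation and preprocessing steps described above can be sketched as follows. This is an illustrative assumption rather than the disclosed implementation: the record fields (`addr`, `rooms`, `sqm`), the sample data, and the functions `normalize` and `preprocess` are all hypothetical, and the sketch only flags contradictions because the text does not state how they are resolved.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical raw records from structured feeds; field names are assumptions.
RAW = [
    {"addr": " 12 Main St ", "rooms": "3", "sqm": "80"},
    {"addr": "12 main st", "rooms": "3", "sqm": "80"},   # duplicate of the first
    {"addr": "12 Main St", "rooms": "4", "sqm": "80"},   # contradicts "rooms"
    {"addr": "7 Oak Ave", "rooms": "2", "sqm": "55"},
]


def normalize(record: Dict[str, str]) -> Tuple[str, int, int]:
    """Transform one raw record into a standard (address, rooms, sqm) tuple."""
    address = " ".join(record["addr"].split()).lower()
    return address, int(record["rooms"]), int(record["sqm"])


def preprocess(raw: List[Dict[str, str]]):
    """Remove duplicates and report fields whose values contradict each other."""
    unique = sorted({normalize(r) for r in raw})  # the set drops exact duplicates
    by_address = defaultdict(list)
    for address, rooms, sqm in unique:
        by_address[address].append({"rooms": rooms, "sqm": sqm})
    contradictions = {}
    for address, versions in by_address.items():
        fields = [f for f in ("rooms", "sqm")
                  if len({v[f] for v in versions}) > 1]
        if fields:
            contradictions[address] = fields
    return unique, contradictions


records, conflicts = preprocess(RAW)
print(records)
print(conflicts)
```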

According to some embodiments, the steps of retrieving data, deriving a first plurality of features, extracting/calculating data, deriving a second plurality of features and/or generating the enriched property listing are conducted continuously and wherein the enriched property listing is continuously updated accordingly. According to some embodiments, the method includes continuous learning on the property features from structured / unstructured data (e.g., reviews, communication with agent, buyer, data from open sources such as social media/newspapers/other publications etc.) and updating the enriched property listing accordingly.

According to some embodiments, the first and/or second plurality of features may include a temporal dataset and/or a spatial dataset.

According to some embodiments, the classes/categories of the property include a discrete characteristic.

According to some embodiments, the classes/categories of the property comprise an aggregation of characteristics. The aggregation of characteristics may relate, for example, to safety, environmental friendliness, children friendliness, senior citizen friendliness, crime level, noise level, pollution level, natural light, building type, view from window(s), building maintenance level, price, transportation, education or any combination thereof.

According to some embodiments, the machine learning algorithm may be trained on an unsupervised dataset comprising a plurality of properties and their associated enriched data.

According to some embodiments, the machine learning algorithm may be trained on a supervised dataset comprising a plurality of properties and a plurality of labels associated with each of the plurality of properties, the plurality of labels comprising: one or more validated classes/categories of each of the plurality of properties.

According to some embodiments, the non-structured feed of the plurality of external data sources may include property description, media, social networks, images, sensors signals, video clips, audio signals, chat bots, police reports, municipal reports or any combination thereof.

According to some embodiments, providing the second plurality of features may include image analysis.

According to some embodiments, the method may further include displaying the trend and/or the prediction of the change of the one or more of the property's features. According to some embodiments, the user interface may be configured to display the trend and/or the prediction of the change of the one or more of the property's features.

According to some embodiments, the method may further include/the processor may further be configured to provide a timeline for the predicted change of one or more of the property's features based on the identified trend. The trend may be a trend in noise level, pollution level, transportation, safety, crime, gentrification, infrastructure, neighborhood plans, building plans, the property value, neighboring property values or any combination thereof.
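One simple way to illustrate a trend-based timeline is to fit a straight line to past observations and extrapolate it. The sketch below assumes an ordinary least-squares linear fit, which is only one possible trend model and is not stated in the disclosure; the functions `linear_trend` and `timeline` and the noise-index values are hypothetical.

```python
from typing import List, Sequence, Tuple


def linear_trend(years: Sequence[float], values: Sequence[float]) -> Tuple[float, float]:
    """Fit values ~ slope * year + intercept by ordinary least squares."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def timeline(years: Sequence[float], values: Sequence[float],
             horizon: int) -> List[Tuple[int, float]]:
    """Extrapolate the fitted trend for `horizon` years past the last observation."""
    slope, intercept = linear_trend(years, values)
    last = int(years[-1])
    return [(y, slope * y + intercept) for y in range(last + 1, last + horizon + 1)]


# Hypothetical yearly noise-level index for a neighborhood (illustrative only).
observed_years = [2019, 2020, 2021, 2022]
noise_index = [50.0, 52.0, 54.0, 56.0]

for year, level in timeline(observed_years, noise_index, horizon=3):
    print(year, round(level, 1))
```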

According to some embodiments, the method may further include/the processor may further be configured to provide a prediction regarding a future change in the value of the property, based on the identified trend.

According to some embodiments, the method may further include/the processor may further be configured to identify an anomaly, risk and/or opportunity.

According to some embodiments, the enriched property listing further comprises a rating of a property owner.

According to some embodiments, the enriched property listing may further include highlighting of rare amenities.

According to some embodiments, the plurality of external data sources comprises property surrounding-associated data. The property surroundings-associated data source may include: police complaints, residents' complaints, feedback from neighbors, municipal information, municipal rules violations, transportation options nearby, noise producing facilities nearby, direct sunlight hours per day, schools' related data, fire hazard, flood/earthquake zones, type of population, availability of facilities, parking availability, renovations/fixer-uppers nearby, neighborhood characteristics or any combination thereof.

According to some embodiments, the method may further include/the processor may further be configured to apply a geographical model to calculate direct sunlight hours per day in the property.
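The disclosure does not define the geographical model. As a rough, hedged illustration, the sketch below computes only the astronomical day length from latitude and day of year using the standard sunrise equation with an approximate solar declination. This is an upper bound on direct sunlight: a practical model would presumably also account for window orientation, neighboring buildings, and terrain. The function name `daylight_hours` is hypothetical.

```python
import math


def daylight_hours(latitude_deg: float, day_of_year: int) -> float:
    """Approximate hours between sunrise and sunset, ignoring obstructions."""
    # Approximate solar declination (degrees) for the given day of the year.
    declination = -23.44 * math.cos(2 * math.pi * (day_of_year + 10) / 365)
    lat = math.radians(latitude_deg)
    dec = math.radians(declination)
    # Sunrise equation: cos(hour_angle) = -tan(latitude) * tan(declination).
    cos_hour_angle = -math.tan(lat) * math.tan(dec)
    # Clamp to handle polar day (24 h) and polar night (0 h).
    cos_hour_angle = max(-1.0, min(1.0, cos_hour_angle))
    hour_angle = math.degrees(math.acos(cos_hour_angle))
    return 2 * hour_angle / 15  # the Earth rotates 15 degrees per hour


# Example: a property at latitude 32.08 degrees north around the June solstice.
print(round(daylight_hours(32.08, 172), 2))
```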

According to some embodiments, there is provided a computer implemented method for producing an enriched property listing, the method comprising: obtaining information indicating availability of a property, retrieving or collecting, utilizing a processing unit, basic data regarding the available property from a plurality of structured feeds, formatting the basic data into a standard format, preprocessing the basic data to remove duplications, identify property value contradictions and expel incorrect property values, identifying additional property features and neighborhood data, related to the available property, by applying an NLP model on unstructured data sources, the unstructured data sources comprising one or more of social media posts, residential reviews, police reports, municipal reports and/or violations, and news items; applying an AI algorithm or big data analysis on the basic data, the property features, and the neighborhood data to determine a status of one or more current quality-of-life attributes of the available property; applying a predictive AI algorithm or big data analysis on the basic data, the property features, and the neighborhood data, to determine a predicted future status of the one or more quality-of-life attributes of the available property, and automatically producing an enriched property listing presenting the current and predicted statuses of the one or more quality-of-life attributes.

According to some embodiments, the method further comprises applying a geographical model to calculate direct sunlight hours per day in the property. According to some embodiments, the determining of the current status of the one or more quality-of-life attributes further comprises applying the AI algorithm or the big data analysis on the calculated direct sunlight hours per day in the property.

According to some embodiments, the structured feeds comprise a plurality of property listings of the available property.

According to some embodiments, the one or more quality-of-life attributes is selected from safety, environmental friendliness, children friendliness, senior citizen friendliness, crime level, noise level, pollution level, natural light, gentrification, building type, view from window(s), building maintenance level, transportation, community, traffic jams, air pollution, education, cleanness of surrounding areas or any combination thereof. Each possibility and combination of possibilities is a separate embodiment.

According to some embodiments, the AI algorithm is a machine learning algorithm trained on an unsupervised dataset comprising a plurality of properties and their associated current and future statuses of the one or more quality-of-life attributes.

According to some embodiments, the AI algorithm is a machine learning algorithm that is trained on a supervised dataset comprising a plurality of properties and a plurality of labels associated with each of the plurality of properties, the plurality of labels comprising: one or more validated current and future statuses of the one or more quality-of-life attributes.

According to some embodiments, the producing of the enriched property listing further comprises presenting a timeline for the predicted change in the future status of the one or more quality-of-life attributes.

According to some embodiments, the method further comprises providing a prediction regarding a future change in an adjusted marginal price of the property, based on the computed future status of the one or more quality-of-life attributes.

According to some embodiments, the method further comprises detecting rare amenities based on the current and/or future statuses of the one or more quality-of-life attributes.

According to some embodiments, the one or more quality-of-life attributes comprises at least two quality-of-life attributes. According to some embodiments, the method further comprises automatically updating the enriched property listing based on a periodic re-applying of the NLP model on the unstructured data sources.

According to some embodiments, the method further comprises predicting a period of time of availability of the available property until a transaction is likely to occur, based on the current and future statuses of the one or more quality-of-life attributes.

Certain embodiments of the present disclosure may include some, all, or none of the above advantages. One or more other technical advantages may be readily apparent to those skilled in the art from the figures, descriptions, and claims included herein. Moreover, while specific advantages have been enumerated above, various embodiments may include all, some, or none of the enumerated advantages.

Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure pertains. In case of conflict, the patent specification, including definitions, governs. As used herein, the indefinite articles “a” and “an” mean “at least one” or “one or more” unless the context clearly dictates otherwise.

BRIEF DESCRIPTION OF THE DRAWINGS

Some embodiments are herein described, by way of example only, with reference to the accompanying drawings. With specific reference to the drawings in detail, it is stressed that the particulars shown are by way of example and for purposes of illustrative discussion of embodiments of the invention. In this regard, the description taken with the drawings makes apparent to those skilled in the art how embodiments of the invention may be practiced.

Attention is now directed to the drawings, where like reference numerals or characters indicate corresponding or like components. In the drawings:

FIG. 1a is an illustrative flow chart of a method for generating an enriched property listing according to some embodiments;

FIG. 1b is an illustrative flow chart of a method for generating an enriched property listing including changes in quality-of-life attributes according to some embodiments;

FIG. 2a schematically shows an outline of the herein disclosed AI-based platform for generating an enhanced property listing, according to some embodiments;

FIG. 2b schematically shows an outline of the herein disclosed AI-based platform for generating an enhanced property listing, according to some embodiments;

FIG. 3 schematically shows an outline of the herein disclosed AI-based platform for continuous/periodic adjustment of an enhanced property listing, according to some embodiments;

FIG. 4 schematically shows an outline of the herein disclosed AI-based platform for classification of properties and/or for generating an enriched property listing with classified features, according to some embodiments;

FIG. 5 schematically shows a user interface of an enriched property listing with feature classification, according to some embodiments;

FIG. 6a is an illustrative flow chart of a method for generating an enriched property listing according to some embodiments; and

FIG. 6b is an illustrative flow chart of a method for generating an enriched property listing according to some embodiments.

DETAILED DESCRIPTION OF THE EMBODIMENTS

The following detailed description is of the best currently contemplated modes of carrying out the invention. The description is not to be taken in a limiting sense, but is made merely for the purpose of illustrating the general principles of the invention, since the scope of the invention is best defined by the appended claims.

According to some embodiments, there is provided a computer implemented method and/or platform for generation of enriched property listings. As used herein, the terms "property listings" and "real estate listing" may be used interchangeably and may refer to any printed advertisement, internet posting, or publicly displayed sign of properties/real estate, which are available for purchasing and/or rent.

As used herein, the term "enriched" and "enhanced" with regards to the property listing, may be used interchangeably and may refer to a property listing which includes data that is not typically included in property listings. According to some embodiments, the enriched property listing may, for example, include information regarding trends in status of the neighborhood in which the property is found.

According to some embodiments, the system/method/platform is configured to retrieve data, associated with a property, from both structured and unstructured feeds from various data sources and to derive therefrom features associated with the property.

As used herein, the term "retrieving" may refer to collecting, downloading, saving in folders or otherwise gathering data, preferably from websites, emails, text messages, video clips or the like.

As used herein, the term “value” when used with regards to a property may refer to a monetary value of the property and/or to a perceived value of the property. The term “perceived value”, as used herein, may refer to a prospective-client-specific perceived value towards a property.

As used herein, the term “quality-of-life attribute” may refer to a value associated with living in, or otherwise utilizing, a property. It is understood that the specific quality-of-life attributes listed in an enriched property listing may vary from property to property, depending for example on whether the property is a city property or a rural property, located in a densely populated area or not etc. Optionally, the enrichment may be personalized. For example, a potential client suffering from allergies may request to obtain additional pollution-related quality-of-life attributes while a potential client who practices sports may request additional sport-associated quality-of-life attributes to be listed in the enriched listing.

Non-limiting examples of quality-of-life attributes include personal safety, community, traffic jams, air pollution, crime level, transportation, education, cleanness of surrounding areas and building, noise level, hours of sunlight, elevator reliability, view from the apartment, etc. in their current status and/or in future predictions. According to some embodiments, the quality-of-life attribute(s) may be individual quality-of-life attribute(s), which may vary between subjects depending on their needs and preferences.

According to some embodiments, the term "structured feed" may refer to a feed that stores data in a predefined format, such as templates or forms, with patterns that make them easily searchable. Accordingly, deriving features from a structured feed may, according to some embodiments, require simple saving, duplicate removal, filtering and/or categorizing of data. According to some embodiments, the structured feed from which data is retrieved may include at least one listing of the property.

According to some embodiments, the terms "unstructured feed", "non-structured feed", "unstructured data", "non-structured data", "unstructured data feed" and "non-structured data feed" may be used interchangeably and may refer to a feed that stores data in a manner that is usually not easily searchable, including formats such as, but not limited to: audio, video, social media postings, text messages and the like. Accordingly, retrieving data from an unstructured feed may, according to some embodiments, require more challenging analysis, including application of complex algorithmic models such as, but not limited to, machine learning models, natural language processing models, image analysis algorithms, speech recognition/transcription models and the like in order to extract relevant features therefrom. A non-limiting example of a non-structured data feed is a data feed that provides images, to which an algorithm capable of extracting relevant information must be applied (e.g. parking availability in proximity to the property during different hours of the day, extracted from a "street view" website). Another example of a non-structured data feed is a data feed of police complaints, on which algorithms, e.g. NLP models or other algorithms, may be applied in order to extract the number, severity, and proximity of complaints related to the “area” or “neighborhood” surrounding the listed property. Another example of a non-structured data feed is a data feed describing renovations in a certain neighborhood. Here, it is still necessary to extract those renovations that are big enough or noisy enough to have an impact on residents of the listed property and those renovations that are geographically close enough to have an impact on the listed property.
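As a non-limiting illustration of the renovation example, the selection of renovations that are both large enough and close enough may be sketched as a distance and size filter. The `Renovation` record, the `floor_area_m2` size proxy, and the 300 m and 200 m² thresholds are assumptions introduced for this sketch and are not taken from the disclosure.

```python
import math
from dataclasses import dataclass


@dataclass
class Renovation:
    description: str
    lat: float
    lon: float
    floor_area_m2: float  # hypothetical proxy for how big/noisy the project is


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    r = 6_371_000.0  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def relevant_renovations(prop_lat, prop_lon, renovations,
                         max_distance_m=300.0, min_area_m2=200.0):
    """Keep renovations that are both large enough and near enough to matter."""
    return [
        r for r in renovations
        if r.floor_area_m2 >= min_area_m2
        and haversine_m(prop_lat, prop_lon, r.lat, r.lon) <= max_distance_m
    ]


renovations = [
    Renovation("tower construction", 40.001, -74.0, 5000.0),   # near and large
    Renovation("kitchen remodel", 40.0005, -74.0, 20.0),       # near but small
    Renovation("mall construction", 40.01, -74.0, 9000.0),     # large but far
]
nearby = relevant_renovations(40.0, -74.0, renovations)
```

In practice the thresholds would likely depend on the type of work and on the density of the area; the fixed values here only make the filtering step concrete.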

According to some embodiments, the unstructured feed may be noisy, with missing values, without standardization, with different IDs between different departments, etc. Non-limiting examples of unstructured feeds include:

(i) Permits data

(ii) Zoning data

(iii) Resident complaints (311) which may include different “domains” (such as, but not limited to, complaints regarding noise, mold, elevator, electricity, cooling, heat).

(iv) City rules/regulations violations

(v) Transportation and infrastructure projects

(vi) Flood zone, fires, earthquake etc. (natural disasters risk)

(vii) School data including for example, school performance/rating, percentage of neighborhood kids attending local school, private school options etc.

(viii) Airplane noise

(ix) Police calls / incidents data, neighborhood patrolling frequency, police budgets etc.

(x) Accidents (indicative of dangerous intersection, etc.)

(xi) Community (library usage, community centers, etc).

(xii) Profile of residents in area

(xiii) Images and free text related to neighborhood and/or residents (e.g. Instagram, local newspapers etc. Such data may for example be used to extract features such as view from windows, sunlight in property, optionally per room, etc.)

According to some embodiments, the system/method/platform is further configured to apply big-data analysis and/or machine learning algorithms on the features derived from the structured and unstructured feed, and preferably also classifying the property into one or more classes/categories, so as to obtain an enriched property listing.

According to some embodiments, by applying the big-data analysis and/or machine learning algorithms, a trend (and/or an anomaly, risk and/or opportunity) associated with the property may be computed. The trend, in turn, may enable predicting a change that may influence a future value of the property and/or impact a future quality of life associated with living in the property and/or in the neighborhood of the property. As a non-limiting example, the processing system may predict a decrease in the crime rate in an area currently considered unsafe. As another non-limiting example, the processing system may predict a rezoning change that, in the future, is likely to impact transportation and infrastructure overload.

As used herein, the term "big data analysis" may refer to the often-complex process of examining large amounts of data, often from different data sources, to uncover information such as, but not limited to, hidden patterns, correlations, market trends, customer preferences, etc.

As used herein, the term “predicting” with regards to a change in the value of the property, may refer to identification of a trend in a value associated with the property. For example, during the past 5 years traffic has increased in the neighborhood and, based on that trend, it is predicted that traffic will further increase. Additionally or alternatively, the term may refer to the anticipation that a certain change will happen irrespective of past years’ experience. For example, the construction of a new entrance road into a neighborhood may result in the prediction that a current traffic situation will improve.
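As a non-limiting illustration of the first sense of “predicting”, a trend over past years may be extrapolated with an ordinary least-squares line. The traffic index values and the function names below are hypothetical and serve only to make the arithmetic concrete; the disclosure does not prescribe this particular model.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_trend(years, values, target_year):
    """Extrapolate the fitted trend line to a future year."""
    slope, intercept = fit_line(years, values)
    return slope * target_year + intercept


# Hypothetical traffic index for the past five years (higher = more congestion).
years = [2019, 2020, 2021, 2022, 2023]
traffic_index = [100, 104, 108, 112, 116]
next_year = predict_trend(years, traffic_index, 2024)  # continues the upward trend
```

The second sense of “predicting” (a change anticipated irrespective of past experience, such as a new entrance road) would not follow from such a fit and would need to be supplied as a separate, event-driven input.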

According to some embodiments, the big data analysis may include thousands, millions or even billions of datapoints. According to some embodiments, the data may be collected continuously. Accordingly, according to some embodiments, the enriched property listing may be dynamic, i.e. undergo continuous and/or periodic updating, based, for example, on local news updates, feedback from potential buyers, social networks posts, internet posts and the like.

According to some embodiments, the enriched property listing may be in the form of a dashboard, which according to some embodiments, may include scroll-down menus and/or click-buttons.

According to some embodiments, machine learning may be applied to classify the property into one or more categories, such as, for example, children friendly, family friendly, luxury, quiet zone, party zone, environmental friendliness ("green"), hours of daylight, safety, school quality, school alternatives, community strength, etc.

According to some embodiments, the machine learning algorithm may be supervised or unsupervised. According to some embodiments, the machine learning algorithm may be trained on a dataset comprising a plurality of estates and their associated enriched data. According to some embodiments, the data upon which the training is carried out may be labeled according to validated classes/categories of each of a plurality of properties.

Reference is now made to FIG. 1a, which is an illustrative flow chart of a method 100a for generating an enriched property listing according to some embodiments.

In step 110a, information indicating availability of a property is obtained. According to some embodiments, the information may be in the form of a property listing obtained from a real-estate agent.

In step 120a, basic data regarding the available property may be retrieved. According to some embodiments, the basic data may include, but is not limited to, the address of the available property, the size of the available property, number of rooms in the available property, date of availability, owner of the available property, amenities associated with the available property, etc. Each possibility and combination of possibilities is a separate embodiment. According to some embodiments, the basic data may be retrieved from structured data sources, such as from the obtained property listing and/or from a plurality of additional property listings directed to the same available property.

In step 130a, the property listing and/or the basic data retrieved therefrom may be formatted into a standard format, using a suitable NLP model, as known to those skilled in the art.

In step 140a, the formatted data may be pre-processed using an NLP model. According to some embodiments, the pre-processing comprises deleting duplications and/or fixing contradictions, as elaborated herein. According to some embodiments, the NLP model may be selected from: Bidirectional Encoder Representations from Transformers (BERT), Robustly Optimized BERT Pretraining Approach (RoBERTa), GPT-3, ALBERT, XLNet, GPT-2, StructBERT, Text-to-Text Transfer Transformer (T5), Efficiently Learning an Encoder that Classifies Token Replacements Accurately (ELECTRA), Decoding-enhanced BERT with disentangled attention (DeBERTa), Dialogflow, spaCy-based models or any combination thereof. Each possibility is a separate embodiment.

In step 150a, additional property features and neighborhood data, related to the available property, may be retrieved by applying AI algorithms on unstructured data sources. Non-limiting examples of unstructured data sources include social media posts, residential reviews, police reports, municipal reports and/or violations, news items, text/audio messages from other potential buyers/renters and/or from real-estate agents, phone calls with potential buyers/renters and/or real-estate agents, emails, and data from public and/or private data providers and the like. Each possibility and combination of possibilities is a separate embodiment. It is understood by one of ordinary skill in the art that different types of data may require different types of algorithms. For example, text data may be analyzed using a same or a different NLP model configured to “read” and “understand” the text. Non-limiting examples of suitable NLP models include Bidirectional Encoder Representations from Transformers (BERT), Robustly Optimized BERT Pretraining Approach (RoBERTa), GPT-3, ALBERT, XLNet, GPT-2, StructBERT, Text-to-Text Transfer Transformer (T5), Efficiently Learning an Encoder that Classifies Token Replacements Accurately (ELECTRA), Decoding-enhanced BERT with disentangled attention (DeBERTa), Dialogflow, spaCy-based models or any combination thereof. Each possibility is a separate embodiment. Image data (pictures, video etc.), on the other hand, may be analyzed using image analysis algorithms such as, but not limited to, Anisotropic diffusion, Hidden Markov models, Image editing, Image restoration, Independent component analysis, Linear filtering, Neural networks, Partial differential equations, Pixelation, Point feature matching, Principal components analysis, Self-organizing maps, Wavelets or any combination thereof. Each possibility is a separate embodiment. Audio data may initially be transcribed into text using a transcription model, such as but not limited to AWS transcribe API or Speech-to-Text API, and then be subjected to text analysis.
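As a non-limiting illustration, the routing of each data type to a suitable analysis may be sketched as a simple dispatcher in which audio is first transcribed and then handled as text. The three analysis functions below are placeholders invented for this sketch: they stand in for an NLP model, a transcription service and an image-analysis model respectively, and do not call any real model or API.

```python
def analyze_text(text):
    # Placeholder for an NLP model: plain keyword spotting, for illustration only.
    keywords = ("noise", "mold", "parking", "crime")
    lowered = text.lower()
    return sorted(k for k in keywords if k in lowered)


def transcribe_audio(audio):
    # Placeholder for a speech-to-text service: the sketch assumes the audio
    # item already carries its transcript.
    return audio["transcript"]


def analyze_image(image):
    # Placeholder for an image-analysis model: returns tags attached upstream.
    return sorted(image.get("tags", []))


def extract_features(item):
    """Route an unstructured item to the analysis suited to its data type."""
    kind = item["kind"]
    if kind == "text":
        return analyze_text(item["content"])
    if kind == "audio":
        # Audio is transcribed first, then subjected to text analysis.
        return analyze_text(transcribe_audio(item["content"]))
    if kind == "image":
        return analyze_image(item["content"])
    raise ValueError(f"unsupported data type: {kind}")


review = {"kind": "text", "content": "Constant NOISE and mold in the lobby"}
call = {"kind": "audio", "content": {"transcript": "No parking after 6pm"}}
photo = {"kind": "image", "content": {"tags": ["park view"]}}
features = [extract_features(i) for i in (review, call, photo)]
```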

In step 160a, the basic data (obtained in step 140a) and the additional property features and neighborhood data (obtained in step 150a) may be inputted into a big-data algorithm, in order to identify quality-of-life attributes associated with the available property. Non-limiting examples of suitable big-data analytics include linear regression, Logistic Regression, Classification and Regression Trees, K-Nearest Neighbors, K-Means Clustering, fuzzy models and the like. Each possibility and combination of possibilities is a separate embodiment. According to some embodiments, the quality-of-life attributes may be directly associated with the available property. Non-limiting examples of such property-related quality-of-life attributes include hours of sunlight in the living room (e.g. in heavily populated areas), problems with the real-estate owner (e.g. based on filed complaints or social media posts), maintenance of the available property and the like. According to some embodiments, the quality-of-life attributes may be associated with the neighborhood. Non-limiting examples of such neighborhood-related quality-of-life attributes include available public parking space in proximity to the available property during different hours of the day, construction taking place in the proximity of the available property and optionally anticipated duration thereof, most common type of crime in the area of the available property and the like.

According to some embodiments, the big-data analysis may also be configured to identify the rarity of amenities, for example, a doorman in an area where this is not typical. According to some embodiments, the rarity may be qualitatively indicated, e.g. by any type of “rare-feature” indication. According to some embodiments, the rarity may be quantitatively indicated, e.g. only 1/100 available properties in this neighborhood have a doorman.
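As a non-limiting illustration, the quantitative and qualitative rarity indications may be sketched as the share of neighborhood listings offering an amenity, followed by a threshold test. The listing layout (a dictionary with an `amenities` set) and the 1/20 threshold are assumptions made for this sketch.

```python
from fractions import Fraction


def amenity_rarity(listings, amenity):
    """Share of listings in the neighborhood that offer the given amenity."""
    if not listings:
        raise ValueError("no listings to compare against")
    count = sum(1 for listing in listings if amenity in listing["amenities"])
    return Fraction(count, len(listings))


def rarity_label(share, rare_threshold=Fraction(1, 20)):
    """Qualitative indication derived from the quantitative share."""
    return "rare feature" if 0 < share <= rare_threshold else "common feature"


# Hypothetical neighborhood: 100 available properties, one of which has a doorman.
neighborhood = [{"amenities": {"elevator"}} for _ in range(99)]
neighborhood.append({"amenities": {"elevator", "doorman"}})

share = amenity_rarity(neighborhood, "doorman")
print(f"only {share} available properties have a doorman: {rarity_label(share)}")
```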

In step 170a, an enriched property listing may be produced, the enriched property listing including the validated/formatted basic data along with identified quality-of-life attributes.

Reference is now made to FIG. 1b, which is an illustrative flow chart of a method 100b for generating an enriched property listing according to some embodiments. Steps 110b to step 150b may be essentially similar/identical to steps 110a to step 150a, as described for FIG. 1a above.

In step 170b, the big data analytics, optionally in conjunction with machine learning (ML) algorithm(s), is applied on the basic data and the additional features to compute a predicted change in one or more of the quality-of-life attributes. According to some embodiments, the ML algorithm may be trained on a data set of properties with known changes in quality-of-life attributes. According to some embodiments, the training may be supervised. According to some embodiments, the training may be unsupervised. According to some embodiments, the additional features used/extracted from the unstructured data for the predicting of changes in the quality-of-life attributes may be the same as those used for computing the current quality-of-life attributes. According to some embodiments, some of the additional features used/extracted from the unstructured data for the predicting of changes in the quality-of-life attributes may be the same as those used for computing the current quality-of-life attributes, while others are further additional features. A non-limiting example of a further additional feature includes future building projects (which may be extracted from unstructured data sources such as city planning decisions, newspaper articles etc.).

In step 180b, an enriched property listing may be produced, the enriched property listing including the validated/formatted basic data, identified quality-of-life attributes along with predicted changes therein.

Reference is now made to FIG. 2a, which schematically shows an outline of the herein disclosed AI-based platform for generating an enriched property listing, according to some embodiments.

Starting at the top left side of the figure, a property listing may be received. It is understood that the property listing may be received from various sources. According to some embodiments, the property listing may be received from a real-estate agent or other source, e.g. via email or file, online/offline interface or any other electronic means. According to some embodiments, the property listing may be downloaded from a website. According to some embodiments, more than one (e.g. 2, 3, 4, 5, 10 or more) property listing may be received for the same property. According to some embodiments, when more than one property listing is received, an algorithm may be applied, which algorithm may be configured to correlate and/or compare the data in the listings and, based thereon, clean up/identify any duplications and/or contradictions between the listings. According to some embodiments, when contradictions are identified, the algorithm may be configured to issue an alert and/or request a user input. Alternatively, the algorithm may automatically select one option over another, based, for example, on a predetermined reliability of one source over another, and/or based on the number of listings indicating a same option. According to some embodiments, when more than one property listing is obtained, the listings may be provided in different formats, in which case the algorithm may be further configured to format all listings and/or listing data into a same standard format.
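As a non-limiting illustration, the automatic selection between contradicting listings may be sketched as follows: when reliability scores are supplied, the value from the most reliable source is selected; otherwise the value indicated by the largest number of listings is selected. Contradicted fields are also reported so that an alert may be issued. The field names, source names and reliability scores are hypothetical.

```python
from collections import Counter


def resolve_field(values_by_source, reliability=None):
    """Return (chosen value, whether the sources contradicted each other)."""
    known = {s: v for s, v in values_by_source.items() if v is not None}
    if not known:
        return None, False
    if len(set(known.values())) == 1:
        return next(iter(known.values())), False
    if reliability:
        # Prefer the value reported by the most reliable source.
        best_source = max(known, key=lambda s: reliability.get(s, 0.0))
        return known[best_source], True
    # Otherwise prefer the value indicated by the most listings.
    winner, _ = Counter(known.values()).most_common(1)[0]
    return winner, True


def merge_listings(listings, reliability=None):
    """Merge listings of one property; each listing needs a unique 'source'."""
    fields = {key for listing in listings for key in listing if key != "source"}
    merged, conflicts = {}, []
    for field in sorted(fields):
        by_source = {listing["source"]: listing.get(field) for listing in listings}
        merged[field], conflicted = resolve_field(by_source, reliability)
        if conflicted:
            conflicts.append(field)  # candidates for an alert or a user query
    return merged, conflicts


listings = [
    {"source": "agent", "rooms": 4, "size_sqft": 900},
    {"source": "portal_a", "rooms": 3, "size_sqft": 900},
    {"source": "portal_b", "rooms": 3},
]
by_vote, flagged = merge_listings(listings)
by_trust, _ = merge_listings(listings, {"agent": 0.9, "portal_a": 0.5, "portal_b": 0.5})
```

With the sample data, the majority vote selects 3 rooms while the reliability-based rule selects the agent's value of 4, which shows that the two strategies named in the text can disagree and that the choice between them is a design decision.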

Next, data may be gathered from various structured and unstructured feeds.

According to some embodiments, the structured feeds may include additional property listings of the property. According to some embodiments, the structured feeds may include additional property listings of other properties in the same neighborhood. According to some embodiments, the unstructured feed may be directly associated with the property, such as but not limited to, images, free-text descriptions of the property, social network posts (e.g. Facebook posts) regarding the property and the like or combinations thereof. Each possibility is a separate embodiment. According to some embodiments, the unstructured feed may also include feeds that are not directly associated with the property, such as, but not limited to, newspaper articles related to the neighborhood of the property, social network feeds related to the neighborhood (e.g. a neighborhood profile), neighborhood images (e.g. from Google Street View), text/audio messages from other potential buyers/renters and/or from real-estate agents, phone calls with potential buyers/renters and/or real-estate agents and the like or combinations thereof. Each possibility is a separate embodiment.

Various AI models may then be applied on the data to derive features therefrom.

According to some embodiments, speech recognition/transcription models may be applied to audio recordings and/or audio messages to generate a transcribed text. Non-limiting examples of suitable transcription algorithms include AWS transcribe API, Speech-to-Text API, etc. Each possibility is a separate embodiment.

According to some embodiments, NLP models may be applied on text to retrieve specific information from the text, identify keywords or key points in the text and the like. According to some embodiments, the one or more NLP models may include one or more autoregressive language models. According to some embodiments, the one or more NLP models may be selected from: Bidirectional Encoder Representations from Transformers (BERT), Robustly Optimized BERT Pretraining Approach (RoBERTa), GPT-3, ALBERT, XLNet, GPT-2, StructBERT, Text-to-Text Transfer Transformer (T5), Efficiently Learning an Encoder that Classifies Token Replacements Accurately (ELECTRA), Decoding-enhanced BERT with disentangled attention (DeBERTa), Dialogflow, spaCy-based models or any combination thereof. Each possibility is a separate embodiment.

According to some embodiments, image analysis algorithms may be applied on images (still or video) in order to extract meaningful information therefrom. According to some embodiments, the image analysis may include one or more of classification, feature extraction, pattern recognition, projection and the like. According to some embodiments, the image analysis may include applying techniques such as but not limited to: Anisotropic diffusion, Hidden Markov models, Image editing, Image restoration, Independent component analysis, Linear filtering, Neural networks, Partial differential equations, Pixelation, Point feature matching, Principal components analysis, Self-organizing maps, Wavelets or any combination thereof.

According to some embodiments, features and/or parameters that can be derived from the unstructured feed include, but are not limited to, one or more of:

• maintenance of the house or the cleanliness of the neighborhood (e.g. based on information obtained from a real-estate agent, neighbor, potential buyer/renter who paid a visit, images (e.g. street view images) etc.).

• sunlight entering property, optionally per room (e.g. by applying geographical model(s) to calculate direct sunlight hours per day in the property).

• developer permits (e.g. based on municipality websites, news reports, advertisements, etc.).

• traffic at various time points of the day (as extracted, for example, from Google Street View, social networks, complaints).

• building owner rating (as extracted from feeds such as reported building violations, posts of previous renters, etc.).

• contamination (e.g. based on location of nearby factories, news reports, social networks of the neighborhood etc.).

• community facilities/activities and quality of same (e.g. based on local newspaper articles, social networks of the neighborhood etc., percentage of membership).

According to some embodiments, the property listing may be enriched by incorporating one or more of the derived features.

Additionally or alternatively, big-data analytics and/or machine learning (ML) algorithms may be applied on the extracted features to predict changes in the value (economic and/or quality-of-life) of the property. According to some embodiments, the prediction may be based on a trend identified based on the extracted features. Additionally or alternatively, the prediction may be derived directly from one or more of the extracted features.

Non-limiting examples of changes/trends that may be predicted include:

• Noise level - for example, a noise level may be predicted to increase based on a construction permit, opening of a bar/club or the like, as derived from one or more structured feeds.

• Pollution - for example, pollution may be predicted to decrease due to the closing of a factory.

• Traffic - for example, traffic may be predicted to increase if new neighborhoods are built while no road construction is being planned.

• View - for example, the view from the window may be predicted to be impaired due to a new building.

• Safety - for example, if a trend in decreased crime rate is observed over time, such trend may be indicative of further improvement.

According to some embodiments, the prediction may be short term. E.g., right now, location of the property is noisy, but noise levels are due to be reduced in a month.

According to some embodiments, the prediction may be long term. For example, right now the crime level of a neighborhood may be high but is predicted to decrease within 5 years because of governmental or municipal investments in the neighborhood, increased police budget and the like.
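As a non-limiting illustration, predictions derived directly from extracted events (rather than from a trend) may be sketched as a rule table that maps an event to the affected attribute and the expected direction of change, together with a short-term/long-term label. The event names, the rule table and the 12-month boundary between short term and long term are assumptions made for this sketch; a deployed system would presumably learn or refine such rules from data.

```python
from dataclasses import dataclass

# Hypothetical rule table: event type -> (affected attribute, expected direction).
EVENT_RULES = {
    "construction_permit": ("noise level", "increase"),
    "bar_opening": ("noise level", "increase"),
    "factory_closing": ("pollution", "decrease"),
    "new_building": ("view", "impaired"),
}


@dataclass
class Prediction:
    attribute: str
    direction: str
    horizon: str


def predict_from_event(event_type, months_until_effect, short_term_months=12):
    """Translate an extracted event into a predicted change with a horizon label."""
    attribute, direction = EVENT_RULES[event_type]
    horizon = "short term" if months_until_effect <= short_term_months else "long term"
    return Prediction(attribute, direction, horizon)


soon = predict_from_event("factory_closing", months_until_effect=1)
later = predict_from_event("new_building", months_until_effect=60)
```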

Reference is now made to FIG. 2b, which schematically shows an outline of the herein disclosed AI-based platform for generating an enriched property listing, according to some embodiments. The outline disclosed in FIG. 2b is similar to the outline of FIG. 2a; however, the big-data analytics and/or machine learning (ML) algorithms are applied on the extracted features to provide an adjusted marginal price of the property, which may then be presented in the enriched property listing. According to some embodiments, the adjusted marginal price may be computed by applying big-data analysis and/or machine learning models on a subset of the features, such as features directly and/or specifically relating to the property, referred to herein as property specific features (as opposed to, for example, neighborhood features). Non-limiting examples of property specific features include one or more of: property specific amenity features (e.g. including/not including swimming pool, washing machine, basement and the like), property specific view/sunlight features (e.g. living room view, hours of sunlight entering the property and the like), property specific noise related features (e.g. bus-stop right outside the property, next door noise creating facility and the like), property specific people related features (e.g. landlord complaints, neighbor complaints and the like).

Reference is now made to FIG. 3, which schematically shows an outline of the herein disclosed AI-based platform for continuous/periodic adjustment of an enhanced property listing, according to some embodiments.

According to some embodiments, an enriched property listing (as generated, for example, as outlined with regards to FIGS. 2a and 2b) may be constantly or periodically (e.g. once a day or once a week) updated, for example, by applying a web crawler (also referred to as a spider, spiderbot or crawler) to systematically browse the internet for new feeds (structured and unstructured), feedback from clients/agents (in the form of text/audio messages, phone calls, emails or the like), and data from public and/or private data providers. According to some embodiments, the newly gathered data may be analyzed in order to identify whether or not it includes new information, for example by applying NLP algorithms capable of identifying key content in the data. The above-described AI models may then be reapplied on the new data in order to update the enriched property listing, to update the computed predictions related to the property and/or to update the marginal price of the property accordingly.
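As a non-limiting illustration, the check of whether newly gathered data includes new information may be approximated by content fingerprinting. This is a deliberately simpler stand-in for the NLP-based key-content analysis mentioned above: it only detects items whose normalized text has already been seen, and it would not recognize a reworded report of the same event. The class and function names are hypothetical.

```python
import hashlib


def fingerprint(text):
    """Hash of the text after case and whitespace normalization."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class UpdateTracker:
    """Remembers which crawled items have already been processed."""

    def __init__(self):
        self._seen = set()

    def filter_new(self, items):
        """Return only the items not seen in this or any earlier crawl."""
        fresh = []
        for item in items:
            fp = fingerprint(item)
            if fp not in self._seen:
                self._seen.add(fp)
                fresh.append(item)
        return fresh


tracker = UpdateTracker()
first_crawl = tracker.filter_new(["New park opens", "new  PARK opens", "Road closed"])
second_crawl = tracker.filter_new(["Road closed", "School rating updated"])
```

Only the items returned by `filter_new` would then need to be passed to the models again, which keeps a daily or weekly update from reprocessing unchanged feeds.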

Reference is now made to FIG. 4, which schematically shows an outline of the herein disclosed AI-based platform for classification of properties and/or for generating an enriched property listing with classified features. FIG. 4 is similar to FIG. 2a with respect to the gathering of data from structured and unstructured databases. However, in addition to, or as an alternative to, the prediction computed by applying big data analytics and/or ML algorithms, as outlined for FIG. 2a, FIG. 4 schematically illustrates a scenario where one or more ML algorithms are applied on the extracted features in order to classify the property into one or more classes. Non-limiting classes into which the property may be classified include: safe properties, green properties, environmentally friendly, strong education, children friendly, family friendly, luxury, quiet zone, party zone and the like. According to some embodiments, a property may be classified into more than one class (e.g. as being green and safe). Non-limiting examples of suitable big-data analytics include linear regression, Logistic Regression, Classification and Regression Trees, K-Nearest Neighbors, K-Means Clustering, fuzzy models and the like. Each possibility and combination of possibilities is a separate embodiment.

Non-limiting examples of suitable algorithms include convolutional neural network (CNN), recurrent neural network (RNN), long-short term memory (LSTM), auto-encoder (AE), generative adversarial network (GAN), Reinforcement-Learning (RL) and the like, as further detailed below. In other embodiments, the specific algorithms may be implemented using machine learning methods, such as support vector machine (SVM), decision tree (DT), random forest (RF), and the like. Each possibility and combination of possibilities is a separate embodiment. Both “supervised” and “unsupervised” methods may be implemented.

According to some embodiments, the classification may be a binary classification (e.g. safe/not safe). According to some embodiments, the classification may be continuous/ranked (e.g. average age of house owners in neighborhood below 30, 30-40, 40-50, 50-60, above 60). According to some embodiments, the classification may be applied on the predictions.
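As a non-limiting illustration, the binary and the continuous/ranked classifications may be sketched as a threshold test and a bucket lookup respectively. The age brackets are the ones given in the example above; the safety score, its 0.5 threshold and the treatment of boundary values (each bracket includes its lower edge) are assumptions made for this sketch.

```python
from bisect import bisect_right

AGE_EDGES = [30, 40, 50, 60]
AGE_LABELS = ["below 30", "30-40", "40-50", "50-60", "above 60"]


def ranked_class(value, edges=AGE_EDGES, labels=AGE_LABELS):
    """Ranked classification: map a numeric value to its ordered bracket.

    A value equal to an edge falls into the higher bracket (assumption).
    """
    return labels[bisect_right(edges, value)]


def binary_class(score, threshold=0.5):
    """Binary classification of a hypothetical safety score in [0, 1]."""
    return "safe" if score >= threshold else "not safe"


owner_age_bracket = ranked_class(35)
safety = binary_class(0.8)
```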

According to some embodiments, an enriched property listing with classification of features may be provided. As a non-limiting example, the property listing may include click buttons, as schematically illustrated in FIG. 5, which when clicked/selected, generates a list (or other type of display) of features/attributes extracted from the data (structured and unstructured) that belong to a certain class of features/attributes. For example, the enriched property listing may be in the form of a dashboard including a plurality of click buttons, each click button linked to a page/document related to a specific class of features/attributes. Optionally, the dashboard may further include a click button linked to a page/document listing or otherwise displaying information related to the future value of the listed property. It is understood that the click buttons shown in FIG. 4 are illustrative only and that additional or other features may be incorporated.

Reference is now made to FIG. 6a, which schematically illustrates a flow chart of the herein disclosed computer implemented method 600 for creating an enriched property listing, according to some embodiments. Method 600 may include a step 610 in which the processing unit obtains and/or retrieves data from a structured feed of a plurality of external data sources and derives therefrom a first plurality of features associated with a property (and storing the data in its memory or in an external memory (e.g. a cloud based storage unit)). The first plurality of features may include temporal datasets and/or spatial datasets. The plurality of external data sources may comprise one or more property listings.

According to some embodiments, the first plurality of features may include the address of the property, the floor, the size in square footage, number of rooms, number of bedrooms, the price and other basic information such as whether there is a doorman, furnishings, how old the building is, etc.

Method 600 may further include a step 620 of retrieving and/or extracting data from unstructured feeds, e.g. by applying AI models thereon, and deriving a second plurality of features associated with the property. The second plurality of features may likewise include temporal datasets and/or spatial datasets. The non-structured feed of the plurality of external data sources may include media, social networks, images, sensor signals, video clips, audio signals, chat bots, public reports (e.g., police reports, municipal reports) or any combination thereof. Each possibility is a separate embodiment.

The data may be collected from different sources, each of which has a different format. After the data is received from the various sources, the collected data may be transformed into a standard format. Even after being transformed into a standard format, the data may have self-contradictions that need to be removed. Accordingly, method 600 may include a step of preprocessing of the transformed collected data to remove contradictions and/or duplications. The retrieved or extracted data and/or data obtained from the one or more property listings may comprise a temporal dataset and/or spatial dataset.
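As a non-limiting illustration, the transformation into a standard format followed by duplicate removal may be sketched as a field-alias mapping with unit conversion. The source field names, the target schema and the choice of square feet as the standard unit are assumptions made for this sketch.

```python
# Hypothetical mapping from source-specific field names to a standard schema.
FIELD_ALIASES = {
    "addr": "address",
    "address": "address",
    "beds": "bedrooms",
    "bedrooms": "bedrooms",
    "sqft": "size_sqft",
    "size_sqft": "size_sqft",
}
SQM_TO_SQFT = 10.7639


def to_standard(record):
    """Rename fields, convert units and normalize the address text."""
    out = {}
    for key, value in record.items():
        if key == "size_sqm":
            out["size_sqft"] = round(float(value) * SQM_TO_SQFT)
        elif key in FIELD_ALIASES:
            out[FIELD_ALIASES[key]] = value
    if "address" in out:
        out["address"] = " ".join(str(out["address"]).lower().split())
    return out


def deduplicate(records):
    """Drop records that are identical once in standard format."""
    unique = []
    for record in records:
        if record not in unique:
            unique.append(record)
    return unique


raw = [
    {"addr": "12  Main St", "beds": 2, "size_sqm": 80},
    {"address": "12 Main St", "bedrooms": 2, "sqft": 861},
]
standard = deduplicate([to_standard(r) for r in raw])
```

Records that agree on some fields but contradict on others survive this step as separate entries; those contradictions would still need to be resolved by the preprocessing described above.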

Method 600 may also include a step 630 of generating, by the processor, an enriched property listing by applying big-data analytics and/or machine learning algorithms on the first and/or second plurality of features. The generating of the enriched property listing may comprise classifying the property into one or more classes/categories. The classes/categories of the property may be labeled in the form of a discrete characteristic (e.g., red brick house, low crime level in a neighborhood, a lot of sunlight) or in the form of an aggregation of characteristics (e.g., prestige homes, fashionable homes, modern homes, apartments suitable for singles, homes or apartments suitable for families). The one or more classes/categories or aggregation of characteristics may relate to any number of features such as safety, environmental friendliness, children friendliness, senior citizen friendliness, crime level, noise level, pollution level, natural light, building type, view from window(s) (view of the city, view of a park or view of water), building maintenance level, price, proximity to transportation, proximity to good schools etc. or any combination thereof. Each possibility is a separate embodiment.

According to some embodiments, the machine learning algorithm may be trained on an unsupervised dataset comprising a plurality of properties and their associated enriched data. Unsupervised machine learning discovers patterns and may cluster the data. For example, a particular estate/property may have associated with it some or all of the following data points and the unsupervised machine learning algorithm executed by the processor may discover patterns from a combination of data from structured feeds and data from unstructured feeds with respect to features such as crime level, safety level, noise level, pollution level, availability of good schools nearby, availability of places to go out nearby, availability of work opportunities nearby, school quality, amount of sunlight, renovations nearby, construction nearby, availability of transportation nearby, safety, crime level, gentrification, infrastructure, neighborhood plans, building plans, the property value, neighboring properties value, residents’ complaints, rezoning implications (big changes in the area), infrastructure overload index, length of period of construction or any combination thereof. Any dataset that is not labeled or classified is considered unsupervised.
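As a non-limiting illustration of unsupervised pattern discovery, unlabeled properties may be clustered with K-Means, one of the techniques named elsewhere in this description. The two features (noise level and crime level, both scaled to the range 0 to 1), the sample values and the deterministic initialization from the first k points are assumptions made for this sketch; practical implementations typically use randomized initialization and many more features.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=20):
    """Plain K-Means; centroids start at the first k points (deterministic)."""
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids


# Hypothetical unlabeled properties as (noise level, crime level).
properties = [(0.10, 0.20), (0.90, 0.80), (0.15, 0.25), (0.85, 0.90)]
assignment, centroids = kmeans(properties, k=2)
```

The clusters carry no names of their own; interpreting one as, say, a "quiet and safe" group is a separate step applied after the patterns are discovered.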

According to some embodiments, the machine learning algorithm may be trained on a supervised dataset comprising a plurality of properties and a plurality of labels associated with each of the plurality of properties, the plurality of labels comprising one or more validated classes/categories of each of the plurality of properties. In addition, certain non-limiting examples of a label comprise any feature or aggregation of features such as safety, environmental friendliness, children friendliness, senior citizen friendliness, crime level, noise level, pollution level, natural light, building type, view from window(s) (view of the city, view of a park or view of water), building maintenance level, price, proximity to transportation, proximity to good schools etc. or any combination thereof. Each possibility is a separate embodiment. In some embodiments, the classifying includes regression-based problem solving. As one non-limiting example, the classifying may involve determining the number of hours of sunlight per day in this property based on a correlation, found across other properties, between the sunlight hours of those properties and features such as which direction the windows face, the size of the windows, which floor the property is on and the height and distance from the property of nearby tall buildings. From this data a linear correlation may be found so as to predict the amount of sunlight in this listed property. Similar regression-based problem solving can be done with other features of the property.
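The sunlight example above may be sketched as an ordinary least-squares fit. This is a minimal sketch only: the three features (south-facing flag, window area, floor), the training values and the helper names `fit_linear` and `predict` are hypothetical and chosen for illustration, not taken from the disclosure.

```python
from typing import List, Sequence


def fit_linear(X: Sequence[Sequence[float]], y: Sequence[float]) -> List[float]:
    """Ordinary least squares via the normal equations (X^T X) w = X^T y.

    A constant 1.0 is prepended to every row, so w[0] is the intercept.
    """
    rows = [[1.0, *r] for r in X]
    n = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(A[k][col]))
        A[col], A[pivot] = A[pivot], A[col]
        if abs(A[col][col]) < 1e-12:
            raise ValueError("features are collinear; cannot fit")
        for k in range(n):
            if k != col:
                f = A[k][col] / A[col][col]
                A[k] = [a - f * b for a, b in zip(A[k], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]


def predict(w: Sequence[float], features: Sequence[float]) -> float:
    """Apply the fitted weights to one property's feature vector."""
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], features))


# Hypothetical comparison properties: (south_facing, window_area_m2, floor).
X = [(1, 4, 2), (0, 4, 2), (1, 2, 10), (0, 6, 5), (1, 6, 1)]
# Hypothetical measured sunlight hours per day for those properties.
y = [6.2, 3.2, 6.0, 4.5, 7.1]

w = fit_linear(X, y)
listed_property_hours = predict(w, (1, 5, 3))
```

The height and distance of nearby tall buildings mentioned in the text could be added as further columns of `X` in the same way.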

According to some embodiments, applying the big-data analysis and/or machine learning algorithms may include, result in or be followed by identifying a change/trend (step 640) in the extracted/collected data and/or in features derived therefrom. The changes/trends may relate to a number of types of data and may relate to one or more features derived from the extracted data. The change/trend may also be displayed to a user. For example, the processor may identify a change/trend in one or more or two or more of or three or more of neighborhood characteristics, availability of restaurants, shops, parks, etc. within a certain number of minutes' walk, a change/trend in noise level, pollution level, transportation, safety, crime level, gentrification, infrastructure, neighborhood plans, building plans, the property value, neighboring properties' value, residents’ complaints, rezoning implications (big changes in the area), infrastructure overload index, length of period of construction, converting a garbage lot to a park, impact of potential renovations/fixer-upper on price, sunlight, noise or other features or data or any combination thereof. Each possibility is a separate embodiment.

In one non-limiting example, when the big-data analysis and/or machine learning algorithms are applied, for example using an unsupervised dataset, identifying a change/trend includes detecting relationships, anomalies, and/or patterns. In one non-limiting example, using permit data and rules violation data obtained from the municipality, it may be found that when a specific developer gets a permit, residents in the relevant area complain about noise and dust after permitted working hours by calling the dedicated municipal phone line (for example “311”). This data may be continuously obtained and updated.

According to some embodiments, applying the big-data analysis and/or machine learning algorithms may further include, or result in, identifying an anomaly (for example, a specific apartment that receives sunlight in Manhattan, or rare amenities such as a doorman in an area where buildings with doormen are scarce, a washing machine inside the apartment, etc.), a risk (for example, problems in construction, infrastructure, elevator, etc., construction nearby by a contractor known to be problematic, etc.) and/or an opportunity (for example, gentrification, upscaling of infrastructure, an increasing number of profitable flips, etc.).

According to some embodiments, method 600 may include step 650 of predicting a change of one or more of the property’s features, optionally based on the identified change/trend. Step 650 may also include providing a timeline of the predicted change of the one or more features of the listed property based on the identified change/trend. The prediction of the change may also be displayed to the user.

Step 650 may include predicting changes over time in the property based on changes/trends. This may be based on a number of criteria, including but not limited to (i) rezoning implications (big changes in the area), such as transportation, price, infrastructure overload index, building period, garbage lot to park, etc. The infrastructure overload index is prepared by the processing unit based on comparisons of infrastructure near the property compared to the amount of infrastructure in other areas. Other criteria include (ii) price trends, (iii) crime and safety (based on police plans and budgeting, etc.), (iv) gentrification detection, (v) residents’ complaints and (vi) the impact of potential renovations/fixer-upper on the price. The prediction may also be regarding a future change in the value of the property based on any of the identified changes/trends (of the one or more features) described herein. The prediction may relate to a change in how valuable an investment the property will be in a certain period of time, for example based on the construction of surrounding area facilities, gentrification, trends in one or more features etc. The prediction may be of a change in non-price-related features or data or in price-related features or data. In one non-limiting example, step 650 includes examining and predicting the future and expected changes in all data points based on building/neighborhood plans.
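The disclosure states only that the infrastructure overload index compares infrastructure near the property with infrastructure in other areas; it does not give a formula. One possible reading, offered purely as an assumption, is the ratio of the local load (demand per unit of capacity) to the median load of the comparison areas. The function name, its inputs and the numbers below are hypothetical.

```python
from statistics import median
from typing import Iterable, Tuple


def infrastructure_overload_index(
    local_demand: float,
    local_capacity: float,
    other_areas: Iterable[Tuple[float, float]],
) -> float:
    """Return local load divided by the median load of comparison areas.

    Load is demand / capacity (for example residents per school seat block or
    per transit stop). A value above 1.0 means the area around the property
    is more heavily loaded than a typical comparison area. This definition is
    an illustrative assumption, not the formula used in the disclosure.
    """
    local_load = local_demand / local_capacity
    baseline = median(d / c for d, c in other_areas)
    return local_load / baseline


# Hypothetical figures: residents and number of transit stops per area.
index = infrastructure_overload_index(
    local_demand=12_000,
    local_capacity=4,
    other_areas=[(10_000, 5), (9_000, 6), (8_000, 2)],
)
```

Using the median rather than the mean keeps a single unusual comparison area from dominating the baseline.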

Reference is now made to FIG. 6b, which schematically illustrates a flow chart of the herein disclosed computer implemented method 200 for creating an enriched property listing, according to some embodiments. Steps 210, 220 and 230 may be similar to steps 610, 620 and 630, respectively. Step 640 includes computing an adjusted marginal price of the property based at least on the big-data analysis and/or an output of the machine learning algorithms, as essentially elaborated herein. The plurality of external data sources may comprise property surrounding-associated data from a non-structured feed. This may include data not directly involving the property but directly pertaining to properties surrounding the listed property, infrastructure surrounding the listed property, the neighborhood surrounding the listed property, the police station surrounding the listed property, construction projects surrounding the listed property, transportation infrastructure (e.g. train stations, bus stations) surrounding the listed property, school facilities surrounding the listed property, zoning of property surrounding the listed property, parks, parking and other characteristics of infrastructure surrounding the listed property.

The property surroundings-associated data source may include: police complaints, residents’ complaints, feedback from neighbors, municipal information, municipal rules violations, transportation options nearby, noise producing facilities nearby, direct sunlight hours per day, schools related data, fire hazard, flood/earthquake zones, type of population in the neighborhood, availability of nearby facilities for eating, socializing and education, parking availability, renovations/fixer-uppers nearby, neighborhood characteristics or any combination thereof.

In one implementation, the extracting of data from the non-structured feed may include extracting or calculating, by the processor, data associated with the property from multiple non-property-listing data sources, where the data comprises data associated with features of an area surrounding the property. In some embodiments, this kind of data captures the features of the lifestyle of someone living in the listed property. For example, whether there is crime in the area, what schools are in the area and how good the schools are, nearby shopping, whether there is construction predicted for the area, proximity to parks, and whether the price is expected to move upward will all have an impact on the life of the resident. In many cases, such data is extracted from non-structured feeds.

The associated surrounding area data may include information about the availability of permits, zoning data, violations of city rules from municipal databases, transportation and infrastructure projects in the area, whether the property is in a flood zone, whether there have been fires or earthquakes (in order to obtain an overall risk of natural disasters), residents’ complaints about many different things including rates, noise, airplanes, music, rodents, elevator, electricity, air conditioning system and heating system, data concerning how close and how highly rated the nearby schools are, the availability of rezoning the property, and data relating to crime and safety in the neighborhood or area in which the listed property is found. Such crime and safety data may be in a comparative form relative to crime and safety data for other neighborhoods and may include or be based on such metrics as how many incidents of police complaints (or other crime statistics) have been received in the area, for example during a preceding time period such as the last 12 months, whether the nearby police station is a new police station and what the budget of the police in the area is. In addition, the processing unit stores on memory crime statistics in similar form for the whole city or for a larger and/or smaller entity (e.g., a state of the United States, a borough of the city of New York, etc.).

In some embodiments, the processor makes a comparison between the incidents of crime in the area surrounding the property and in the larger municipal or state or other area. Based on the comparison, the processor may have the online enriched property listing state a conclusion such as “low crime” or “below average crime” within the property listing for that apartment or house, in a section of the property listing called “Safety” that is reached after clicking on a heading called “Area Insights”.
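The comparison described above can be sketched as follows. The per-1,000-residents normalization, the ratio thresholds (0.5, 1.0 and 1.5) and the labels other than “low crime” and “below average crime” are assumptions introduced for illustration; the disclosure does not specify them.

```python
def crime_label(
    local_incidents: int,
    local_population: int,
    reference_incidents: int,
    reference_population: int,
) -> str:
    """Compare the local incident rate with a larger reference area.

    Both counts are assumed to cover the same period (for example the last
    12 months). Thresholds are illustrative assumptions only.
    """
    local_rate = local_incidents / local_population * 1000
    reference_rate = reference_incidents / reference_population * 1000
    ratio = local_rate / reference_rate
    if ratio < 0.5:
        return "low crime"
    if ratio < 1.0:
        return "below average crime"
    if ratio < 1.5:
        return "above average crime"
    return "high crime"


# Hypothetical counts: a neighborhood of 10,000 residents versus a city of 8 million.
safety_section = crime_label(30, 10_000, 50_000, 8_000_000)
```

Normalizing by population before comparing avoids labeling a dense neighborhood as unsafe simply because it has more residents.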

According to some embodiments, the processor applies big-data analysis and/or machine learning algorithms to identify a change/trend (and/or an anomaly and/or risk and/or opportunity) pertaining to the extracted data or to one of the features derived from the data extracted or calculated from the non-structured feed. For example, the processor may identify a change/trend relating to crime rates in the area immediately surrounding the listed property. Consequently, the enriched property listing may display this trend relating to the crime, for example based on the police plans and budgetary information. The trend may relate to noise level, pollution level, transportation, safety, crime, gentrification detection, infrastructure, neighborhood plans with or without rezoning implications (big changes in the area), building plans, the value of the property or of neighboring properties, infrastructure overload index, building period, converting a garbage lot to a park, residents’ complaints, price trends including from the impact of potential renovations, etc. or any combination thereof. The infrastructure overload index is prepared by the processing unit based on comparisons of infrastructure near the property compared to the amount of infrastructure in other areas. Based on the identified change/trend, the processor may make a prediction of a change in one or more features, and the enriched listing may display the prediction. According to some embodiments, the prediction may be made as a separate statement or may be part of a storyline concerning the property. For example, a storyline could be “right now the crime level is high but in 5 years we predict it to be lower”, or “the neighborhood is noisy now due to construction but is expected to be very quiet in two years from now”.
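A minimal sketch of turning an identified trend into a prediction and a storyline is given below. It assumes a simple straight-line extrapolation of yearly values; the disclosure does not state which trend model is used, and the function names and the sample crime index values are hypothetical.

```python
from typing import Sequence, Tuple


def linear_trend(years: Sequence[float], values: Sequence[float]) -> Tuple[float, float]:
    """Least-squares slope and intercept of values against years."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    slope = sum((x - mean_x) * (v - mean_y) for x, v in zip(years, values)) / sum(
        (x - mean_x) ** 2 for x in years
    )
    return slope, mean_y - slope * mean_x


def storyline(feature: str, years: Sequence[float], values: Sequence[float], horizon: int) -> str:
    """Extrapolate the trend `horizon` years ahead and phrase it as a storyline."""
    slope, intercept = linear_trend(years, values)
    predicted = slope * (years[-1] + horizon) + intercept
    if predicted < values[-1]:
        direction = "lower"
    elif predicted > values[-1]:
        direction = "higher"
    else:
        direction = "unchanged"
    return (
        f"right now the {feature} is {values[-1]} "
        f"but in {horizon} years we predict it to be {direction}"
    )


# Hypothetical yearly crime index for the area surrounding the listed property.
years = [2019, 2020, 2021, 2022, 2023]
crime_index = [50, 46, 42, 38, 34]
text = storyline("crime index", years, crime_index, horizon=5)
```

The same helper could be pointed at any of the other time series named in the text (noise, pollution, residents’ complaints), provided the values are numeric.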

Other non-limiting examples of associated surrounding area data that may be extracted or calculated from a non-structured feed include whether there are dangerous intersections nearby and how many vehicle or other accidents there have been in the surrounding area, especially close to the property/estate. In these cases, too, the processor may use the big data analysis and/or machine learning algorithms to identify changes/trends relating to these features and to make and display a prediction based on the identified changes/trends.

Further non-limiting examples of associated surrounding area information include social information, for example, how close is the property to appealing places for a couple or a family to go out to. Such places can include a wide range such as clubs, the opera or movies or anything else. Another kind of data is how close the property is to parks, how close to dog parks, how close to public transportation, how close to schools, how close to highly rated schools, how close to day care centers, how close to water parks, how close to restaurants, how close to theatre, how close to shopping such as malls, how close to grocery stores and other stores that sell daily essentials. In addition to “how close to” these types of places, related data of how many of these types of places are in the vicinity of the property. Trends relating to this data may also be identified and predictions made and displayed in the enriched property listing.
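The “how close to” and “how many” measures described above can be sketched with a great-circle distance calculation. The haversine formula is a standard approximation; the coordinates, categories, the 1 km radius and the helper names below are hypothetical and chosen only for illustration.

```python
import math
from typing import Dict, Iterable, Tuple

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def summarize_nearby(
    property_location: Tuple[float, float],
    places: Iterable[Tuple[str, float, float]],
    radius_km: float,
) -> Dict[str, Tuple[int, float]]:
    """For each category, return (number of places within radius, nearest distance in km)."""
    summary: Dict[str, Tuple[int, float]] = {}
    for category, lat, lon in places:
        d = haversine_km(property_location[0], property_location[1], lat, lon)
        if d <= radius_km:
            count, nearest = summary.get(category, (0, float("inf")))
            summary[category] = (count + 1, min(nearest, d))
    return summary


# Hypothetical places of interest: (category, latitude, longitude).
places = [
    ("park", 40.7712, -73.9650),
    ("restaurant", 40.7700, -73.9600),
    ("restaurant", 40.7705, -73.9605),
    ("school", 40.9000, -73.9600),
]
nearby = summarize_nearby((40.7700, -73.9600), places, radius_km=1.0)
```

Straight-line distance is a simplification; a walking-time measure, as mentioned elsewhere in the description, would need a routing data source.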

Another parameter may be a calculation of the amount of sunlight based on a variety of factors that may include applying complex geographical models considering the day in the year and other factors, such as the topology of the building and the surrounding area.

A further type of surrounding area data is data in which the processor, using big data and/or machine learning algorithms, may identify trends, such as trends concerning property values, trends concerning tax abatements, trends concerning zoning laws, trends or changes in construction nearby affecting sunlight, property value, the amount of outdoor space and views, and accessibility to parking, transportation and parks etc., trends or changes in demographics of neighboring residents (including gentrification), trends or changes in crime expected, trends or changes in educational opportunities based on new schools or closing of old schools, trends or changes in pollution, trends or detected changes in the risk of natural disasters, and trends in resident complaints. In some cases, the sources of data include anonymous cell-phone/application data from which a profile of new residents in an area can be detected, for example to determine the type of new residents who are moving into the surrounding area.

The above examples are not in any way exclusive, and a trend identified in any features or data described herein may result in a prediction of a change in a feature of the property.

Unlike the prior art, the surrounding area data may be continuously obtained so as to continuously update and improve the accuracy of the data. This may be accomplished for example by taking data as to ongoing feedback from agents and/or data providers, potential buyers, advisors and others. In addition, continuous data may be obtained from pictures of houses (and of garden, pool, bricks, fireplaces, etc.). In general, the steps of retrieving data, deriving a first plurality of features, extracting/calculating data, deriving a second plurality of features and/or generating the enriched property listing are in some embodiments conducted continuously and the enriched property listing is continuously updated accordingly.

According to some embodiments, the processing unit may also highlight rare amenities in particular property listings. They are rare relative to other listings in the area. For example, a property in a large city may be stated to have “lots of direct sunlight”.
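Rarity relative to other listings in the area can be sketched as a simple frequency check. The 10% cut-off, the function name and the sample listings are assumptions made for illustration; the disclosure does not define a threshold.

```python
from typing import List, Sequence, Set


def rare_amenities(
    listing_amenities: Set[str],
    area_listings: Sequence[Set[str]],
    max_share: float = 0.10,
) -> List[str]:
    """Return the listing's amenities that appear in at most `max_share`
    of the other listings in the same area (sorted for stable output)."""
    total = len(area_listings)
    rare = []
    for amenity in sorted(listing_amenities):
        share = sum(amenity in other for other in area_listings) / total
        if share <= max_share:
            rare.append(amenity)
    return rare


# Hypothetical area: 10 other listings, most with an elevator, one with a doorman.
area = [{"elevator"} for _ in range(8)] + [{"elevator", "doorman"}, set()]
highlights = rare_amenities({"doorman", "elevator"}, area)
```

In this sketch the doorman is highlighted because only one in ten comparison listings has one, while the elevator is not because it is common in the area.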

In some embodiments, the processing unit enriches the property listing based on features characteristic of the owner of the property. For example, the computer processing unit, using big data and/or machine learning algorithms, may construct a rating of the owners of buildings that are typically used for rentals. For example, the data collected on the building owners may include such things as the number and frequency of building violations in buildings other than the listed property. This sheds light on the management style of the owner and on how the listed property may be managed.

Another step of method 600 may include prompting a user to choose which feature to use as a criterion for searching for property listings, and to rate what weight the feature should be given in influencing a search result in response to a search of property listings by the user. For example, a user may input to the processor controlling interaction with a web site that the feature is so important to the user that all properties without that feature should be automatically excluded from the search results of properties presented to the user. An alternative option that may be presented to the user is to input that the feature is very important and should be given weight so as to influence the search results (for example, by giving more weight or significantly more weight or priority to properties with this feature) but not so important that all listings should automatically be excluded if they lack this feature. Accordingly, the enriched property listing is configured to allow an on-demand class/category-based retrieval of the property listing by a user.
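The two options described above, a feature that excludes listings outright and a feature that only adds weight, can be sketched as a filter followed by a weighted ranking. The data layout, the additive scoring rule and all names below are hypothetical.

```python
from typing import Dict, List, Set


def search_listings(
    listings: List[Dict],
    required: Set[str],
    weights: Dict[str, float],
) -> List[str]:
    """Return listing ids ranked by weighted score.

    Listings lacking any `required` class/category are excluded; the
    remaining ones are scored by summing the user's weight for each
    weighted class/category they carry. Ties are broken by id.
    """
    scored = []
    for listing in listings:
        classes = listing["classes"]
        if not required <= classes:
            continue  # a must-have feature is missing: exclude automatically
        score = sum(w for name, w in weights.items() if name in classes)
        scored.append((score, listing["id"]))
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [listing_id for _, listing_id in scored]


# Hypothetical enriched listings with their classes/categories.
catalog = [
    {"id": "A", "classes": {"low crime", "lots of sunlight"}},
    {"id": "B", "classes": {"low crime", "near good schools", "lots of sunlight"}},
    {"id": "C", "classes": {"near good schools", "lots of sunlight"}},
]
results = search_listings(
    catalog,
    required={"low crime"},
    weights={"near good schools": 2.0, "lots of sunlight": 1.0},
)
```

Listing C is dropped because it lacks the required class, while B ranks above A because it carries the more heavily weighted class.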

Each of the steps may be regularly updated. For example, one or all of the steps of collecting data, extracting from the collected data a second plurality of features, generating the enriched property listing, identifying a trend and predicting a change in the identified trend, are conducted continuously and the enriched property listing is continuously updated accordingly.

In the description and claims of the application, the words “include” and “have”, and forms thereof, are not limited to members in a list with which the words may be associated.

Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure pertains. In case of conflict, the patent specification, including definitions, governs. As used herein, the indefinite articles “a” and “an” mean “at least one” or “one or more” unless the context clearly dictates otherwise.

It is appreciated that certain features of the disclosure, which are, for clarity, described in the context of separate embodiments, may also be provided in combination in a single embodiment. Conversely, various features of the disclosure, which are, for brevity, described in the context of a single embodiment, may also be provided separately or in any suitable subcombination or as suitable in any other described embodiment of the disclosure. No feature described in the context of an embodiment is to be considered an essential feature of that embodiment, unless explicitly specified as such. Although stages of methods according to some embodiments may be described in a specific sequence, methods of the disclosure may include some or all of the described stages carried out in a different order. A method of the disclosure may include a few of the stages described or all of the stages described. No particular stage in a disclosed method is to be considered an essential stage of that method, unless explicitly specified as such.

Although the disclosure is described in conjunction with specific embodiments thereof, it is evident that numerous alternatives, modifications and variations that are apparent to those skilled in the art may exist. Accordingly, the disclosure embraces all such alternatives, modifications and variations that fall within the scope of the appended claims. It is to be understood that the disclosure is not necessarily limited in its application to the details of construction and the arrangement of the components and/or methods set forth herein. Other embodiments may be practiced, and an embodiment may be carried out in various ways.

The present invention may be a system, a method, and/or a computer program product. The computer program product may include a computer readable storage medium (or media) having computer readable program instructions thereon for causing a processor to carry out aspects of the present invention.

The computer readable storage medium can be a tangible device that can retain and store instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. A non-exhaustive list of more specific examples of the computer readable storage medium includes the following: a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disk (DVD), a memory stick, a mechanically encoded device having instructions recorded thereon, and any suitable combination of the foregoing. A computer readable storage medium, as used herein, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire. Rather, the computer readable storage medium is a non-transient (i.e., non-volatile) medium.

Computer readable program instructions described herein can be downloaded to respective computing/processing devices from a computer readable storage medium or to an external computer or external storage device via a network, for example, the Internet, a local area network, a wide area network and/or a wireless network. The network may comprise copper transmission cables, optical transmission fibers, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. A network adapter card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium within the respective computing/processing device.

Computer readable program instructions for carrying out operations of the present invention may be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine dependent instructions, microcode, firmware instructions, state-setting data, or either source code or object code written in any combination of one or more programming languages.

The computer readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer (or cloud) may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider) including wired or wireless connection (such as, for example, Wi-Fi, BT, mobile, and the like). In some embodiments, electronic circuitry including, for example, programmable logic circuitry, field-programmable gate arrays (FPGA), or programmable logic arrays (PLA) may execute the computer readable program instructions by utilizing state information of the computer readable program instructions to personalize the electronic circuitry, in order to perform aspects of the present invention.

Aspects of the present invention are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer readable program instructions.

These computer readable program instructions may be provided to a processor of a general-purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer readable program instructions may also be stored in a computer readable storage medium that can direct a computer, a programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable storage medium having instructions stored therein comprises an article of manufacture including instructions which implement aspects of the function/act specified in the flowchart and/or block diagram block or blocks.

The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable apparatus or other device to produce a computer implemented process, such that the instructions which execute on the computer, other programmable apparatus, or other device implement the functions/acts specified in the flowchart and/or block diagram block or blocks.

The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts or carry out combinations of special purpose hardware and computer instructions.

The descriptions of the various embodiments of the present invention have been presented for purposes of illustration, but are not intended to be exhaustive or limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, the practical application or technical improvement over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

The following examples are presented in order to more fully illustrate some embodiments of the invention. They should in no way be construed, however, as limiting the broad scope of the invention. One skilled in the art can readily devise many variations and modifications of the principles disclosed herein without departing from the scope of the invention.

EXAMPLES

Example 1 - producing enriched property listing

A property listing including the following data was obtained:

Property listing before enrichment:

Address: XXX St, New York, NY 10021

Price: $20M

Based on this data (here the address only), basic data regarding the property was retrieved from structured data sources, including other property listings retrieved for the property. The data was pre-processed and formatted into a standard format.
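The formatting and pre-processing step can be sketched as below: records from several structured feeds are normalized to a common key, duplicates are merged, and fields on which the feeds disagree are flagged. Resolving a contradiction by keeping the most frequently reported value is an assumption made for this sketch, as are the field names and sample records.

```python
from collections import Counter
from typing import Dict, List, Tuple


def merge_feeds(records: List[Dict]) -> Tuple[Dict[str, Dict], Dict[str, List[str]]]:
    """Merge records describing the same property and flag contradictions.

    Returns (merged, contradictions): `merged` maps a normalized address to
    one standard-format record; `contradictions` lists, per address, the
    fields for which the feeds reported different values.
    """
    by_address: Dict[str, List[Dict]] = {}
    for record in records:
        # Standard format: upper-case address with single spaces.
        key = " ".join(record["address"].upper().split())
        by_address.setdefault(key, []).append(record)

    merged: Dict[str, Dict] = {}
    contradictions: Dict[str, List[str]] = {}
    for key, group in by_address.items():
        merged[key] = {}
        fields = sorted({f for r in group for f in r if f != "address"})
        for field in fields:
            counts = Counter(r[field] for r in group if field in r)
            # Assumption: the most frequently reported value is kept and
            # minority values are discarded as likely incorrect.
            merged[key][field] = counts.most_common(1)[0][0]
            if len(counts) > 1:
                contradictions.setdefault(key, []).append(field)
    return merged, contradictions


# Hypothetical records for one property coming from three structured feeds.
feeds = [
    {"address": "12 Example St", "bedrooms": 3, "building_type": "Condo"},
    {"address": "12  example st", "bedrooms": 3},
    {"address": "12 EXAMPLE ST", "bedrooms": 4, "building_type": "Condo"},
]
merged, contradictions = merge_feeds(feeds)
```

A production system would likely need a more robust address matcher than case and whitespace normalization, and a more careful rule than a simple majority for deciding which value is incorrect.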

Then additional property features and neighborhood data, related to the property, were retrieved by applying AI algorithms on unstructured data sources, using the retrieved basic data as input to the algorithms. The unstructured data sources included social media posts, residential reviews, websites, police reports, municipal reports, news items, text/audio messages from other potential buyers/renters and/or from real-estate agents, emails, and data from public and private data providers etc. It is understood that while GPT2, Hidden Markov models, and AWS transcribe API were used in this specific example, one of ordinary skill in the art would readily understand that various other algorithms are likewise applicable (and as such within the scope of this disclosure) and that the application of these algorithms on the data is within the abilities of those skilled in the art.

The additional property features and neighborhood data extracted were then inputted into a big-data algorithm (here K-Nearest Neighbors), in order to identify quality-of-life attributes associated with the property. As before, it is understood that while K-Nearest Neighbors was used in this specific example, other algorithms are likewise applicable and as such within the scope of this disclosure. It is within the skills of one of ordinary skill in the art to recognize which algorithms are suitable and to implement them on the data.
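A minimal K-Nearest Neighbors sketch is shown below to illustrate how a quality-of-life attribute might be assigned from extracted features. The two features, the labeled reference properties, the value of k and the function name are hypothetical; the example does not reproduce the data or configuration actually used.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple


def knn_predict(
    labeled: List[Tuple[Sequence[float], str]],
    query: Sequence[float],
    k: int = 3,
) -> str:
    """Return the majority label among the k labeled points closest to `query`
    (Euclidean distance)."""
    nearest = sorted(labeled, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Hypothetical reference properties:
# (distance to nearest construction site in km, traffic index 0..1) -> noise level.
reference = [
    ((0.1, 0.90), "High"),
    ((0.2, 0.80), "High"),
    ((0.3, 0.85), "High"),
    ((2.0, 0.20), "Low"),
    ((2.5, 0.10), "Low"),
    ((1.8, 0.30), "Low"),
]
noise_attribute = knn_predict(reference, (0.25, 0.80))
```

Because the features are on different scales, a real application would normally standardize them before computing distances.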

The big data analysis in conjunction with machine learning (ML) was then applied on the basic data and the additional features retrieved to compute a predicted change in one or more of the quality-of-life attributes. It is understood that while some of the quality-of-life attributes may be directly related to amenities, this analysis further identifies changes that are likely to occur in the stated amenities.

All this data is then gathered into an enriched property listing. A non-limiting example of such an enriched property listing is provided below. It is understood that the specific quality-of-life attributes listed in the enriched property listing may vary from property to property depending, for example, on whether the property is a city property or a rural property, or in a densely populated area or not. Optionally, the enrichment may be personalized; for example, a potential client suffering from allergies may request additional pollution-related quality-of-life attributes, and a potential client who plays sports may request additional sport-associated quality-of-life attributes to be listed in the enriched listing. The enriched property listing illustrated below is thus exemplary and illustrative only.

Property listing after enrichment (data is just an example):

• Address: XX St, New York, NY 10021;

• Price: $20M;

• Price History: Jan 2001 $10M, Aug 2005 $13M, June 2011 $14M, Mar 2018 $18M;

• Fair Value: $19M;

• Foreclosure: None;

• Violations: None;

• Building type: Condo;

• Hazard: Flood Risk: High; Fire risk: Low; Earthquake risk: Low

• Amenities: Elevator, Doorman, Balcony, Fitness center, Parking, Storage, Terrace, Fireplace, Pool, No air conditioning, No Garden, No Bricks, Laundry in building;

• Nearby POI (Point of Interest): Library, Community center (55+), Central Park, Grocery Store, Many Restaurants;

• Daycare: List of places + ranking

• Schools: List of schools + ranking, including predicted schools and school level (new schools, trends in scores);

• Neighborhood Demographic: Mostly families, ages 25-40, many children;

• Pollution: Medium; Predicted Pollution: Low (Reason: traffic reduction by changing roads, timeline 2 years);

• Noise: High; Predicted Noise: Low (Reason, end of construction, timeline 2 years);

• Traffic: Heavy; Predicted traffic: Medium (Reason, New underground, timeline 2 years);

• Walkability: Low; Predicted Walkability: High (Open access to Central Park, post constructions, timeline 2 years);

• Bikeable: Low; Predicted Bikeable: High (Open access to Central Park, post constructions, timeline 2 years);

• Maintenance level: Superb (40 reviews / mentioned);

• Cleanness of the neighborhood: Good; Predicted Cleanness of the neighborhood (Reason, end of construction, timeline 2 years);

• Estimated time till off market: 6 months;

• Crime Index: Medium; Predicted Crime Index: Low (Reason, end of construction, timeline 2 years);

• Public Transportation: Medium; Predicted Public Transportation: High (Reason, New underground, timeline 2 years);

• Sunlight entering property: High (>8 hours); Predicted Sunlight entering property: Low (Reason, new building to the south, timeline 3 years).

While certain embodiments of the invention have been illustrated and described, it will be clear that the invention is not limited to the embodiments described herein. Numerous modifications, changes, variations, substitutions and equivalents will be apparent to those skilled in the art without departing from the spirit and scope of the present invention as described by the claims, which follow.