Bhrigu Service¶
Bhrigu Service is an interface through which a client can get the following in real time:
- User Profiles
- Predictions
- Recommendations

Most of the functionality is exposed as gRPC endpoints. Find the proto file here. This gRPC service is integrated with the API Gateway and can be called asynchronously. The environments and their registered service names are listed below.
Environment | Service Name |
---|---|
Local | bhrigu_local |
Staging | bhrigu_stg |
Production | bhrigu_prod |
User Profiles¶
A User Profile is a rolled-up persona of a user. User profiles are stored in the database (datastore schema) and updated in real time as the user continues their journey on the website.
A user profile consists of two main parts:
- userdata (map<text,map<text,int>>) - Stores the user's data in a map where the key denotes a section and the inner map holds the data for that section
- userpreference (map<text,text>) - Stores pre-computed information inferred from the user data
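The nested shape of userdata can be illustrated with a small sketch. This is not the service's implementation: the `new_profile` and `record` helpers are hypothetical names, and the assumption is simply that each tracked action increments one counter inside one section.

```python
from collections import defaultdict


def new_profile():
    """Create an empty profile with the two parts described above.

    Hypothetical helper: userdata is map<text, map<text, int>>,
    userpreference is map<text, text>.
    """
    return {
        "userdata": defaultdict(lambda: defaultdict(int)),
        "userpreference": {},
    }


def record(profile, section, key, amount=1):
    """Roll one tracked action up into the profile (assumed behaviour)."""
    profile["userdata"][section][key] += amount


profile = new_profile()
record(profile, "makes", "Honda")            # one Honda page view
record(profile, "makes", "Honda")            # another one
record(profile, "counts", "timespent", 30)   # 30 units of time spent
```

After these calls, `profile["userdata"]["makes"]["Honda"]` is 2 and `profile["userdata"]["counts"]["timespent"]` is 30, which matches the section/sub-key layout of the example profile.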
Example User Profile
{
"userdata": {
"bodystyle": {
"Compact Sedan": 8, "Hatchback": 14, "Sedan": 8
},
"budgetsegment": {
"BP": 16, "C": 9, "CM": 4, "D": 1
},
"budgetsegmentpqs": {
"BP": 1
},
"budgetsegmentwisetimespent": {
"BP": 87, "C": 32, "CM": 17
},
"counts": {
"contentpages": 0, "electriccars": 0, "newcarpages": 47, "otherpages": 8, "pricequotes": 1,
"timespent": 675, "usedcarpages": 0
},
"makepqs": {
"Honda": 1
},
"makes": {
"Fiat": 4, "Ford": 3, "Honda": 7, "Hyundai": 7, "Maruti Suzuki": 6, "Skoda": 1, "Toyota": 2
},
"modelpqs": {
"Jazz": 1
},
"modelpqs_id": {
"547": 1
},
"models": {
"Amaze": 3, "Aspire": 3, "Avventura": 1, "Baleno": 3, "Ciaz": 3, "Elite i20": 2, "Jazz": 4,
"Linea": 1, "Linea Classic": 1, "Octavia": 1, "Platinum Etios": 2, "Urban Cross": 1, "Xcent": 2, "i20 Active": 3
},
"models_id": {
"465": 1, "547": 4, "587": 3, "595": 1, "812": 1, "930": 3, "1044": 2, "1059": 1, "1087": 2,
"1116": 3, "1123": 1, "1161": 2, "1168": 3, "1181": 3
},
"modelwisetimespent": {
"Amaze": 6, "Aspire": 19, "Baleno": 6, "Ciaz": 22, "Elite i20": 20, "Jazz": 19,
"Platinum Etios": 17, "Xcent": 23, "i20 Active": 4
},
"modelwisetimespent_id": {
"547": 19, "587": 19, "930": 6, "1044": 17, "1087": 23, "1116": 22, "1161": 20, "1168": 6, "1181": 4
},
"pagewisetimespent": {
"HomePage": 343, "MakePage": 182, "ModelPage": 122, "VersionPage": 14
},
"price": {
"5": 7, "6": 10, "7": 9, "8": 3, "16": 1
},
"source": {
"google-organic": 55
},
"versionpqs_id": {
"4985": 1
},
"versions_id": {
"4170": 1, "4318": 1, "4889": 1, "4985": 2, "5250": 1, "5647": 1, "5735": 1, "5744": 1
},
"versionwisetimespent_id": {
"5250": 14
}
},
"userpreference": {
"carpreference": "New",
"pricedeviation": "1.9765289440498124"
}
}
The User Profile of any user can be read at run time to make informed decisions on the website. To do this, call the RPC method GetUserProfile (proto file). The endpoint is common to all applications (CarWale, BikeWale).
Request Message¶
Below is an explanation of the fields in the input request message UserProfileRequest:
- application - Application (enum) defined in the proto file
- cookieid - Unique user identifier
- queries - List of queries to get different parameters for the user. Each query gets one parameter; send multiple queries to get more than one parameter (e.g., number of distinct models, New Car Pages count, lead count, top 5 models visited).
Query¶
A query is a defined protocol for getting a parameter from the user profile. The currently supported queries are:
- Top k Keys
  To get the top k keys in any section.
  Example: Top 3 Body Styles the user is interested in (Key - bodystyle)
- Number of Sub Keys
  To get the number of keys in any section.
  Example: Number of distinct models visited (Key - models)
- Value of Sub-Key
  To get the value of a sub-key in any section.
  Example: New Car Pages Count (Key - counts, Sub-Key - newcarpages)
- Value of the Preference Key
  To get the value of any pre-computed preference.
  Example: Car Preference of the user (Key - carpreference)
- Value Sum of Sub Keys
  To get the sum of the values of all sub-keys of a particular key.
  Example: Total Number of Images seen (Key - imagesmodelcount)
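To make the five query types concrete, here is a minimal local sketch that evaluates each of them against a trimmed copy of the example user profile shown earlier. It only illustrates what each query returns; it is not the service's implementation or the caller library's API. The function names are hypothetical, and the alphabetical tie-break in `top_k_keys` is an assumption.

```python
# Trimmed copy of the example user profile from this document.
PROFILE = {
    "userdata": {
        "bodystyle": {"Compact Sedan": 8, "Hatchback": 14, "Sedan": 8},
        "counts": {"newcarpages": 47, "otherpages": 8, "pricequotes": 1},
        "makes": {"Fiat": 4, "Ford": 3, "Honda": 7},
    },
    "userpreference": {"carpreference": "New"},
}


def top_k_keys(profile, key, k):
    """Top k Keys: sub-keys of a section, highest value first.

    Ties are broken alphabetically (an assumption for this sketch).
    """
    section = profile["userdata"].get(key, {})
    ranked = sorted(section.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:k]]


def count_sub_keys(profile, key):
    """Number of Sub Keys: how many distinct sub-keys a section holds."""
    return len(profile["userdata"].get(key, {}))


def sub_key_value(profile, key, sub_key):
    """Value of Sub-Key: value stored under key/sub_key (0 if absent)."""
    return profile["userdata"].get(key, {}).get(sub_key, 0)


def preference_value(profile, key):
    """Value of the Preference Key: a pre-computed preference, or None."""
    return profile["userpreference"].get(key)


def sum_sub_keys(profile, key):
    """Value Sum of Sub Keys: sum of all values in a section."""
    return sum(profile["userdata"].get(key, {}).values())
```

For example, `sub_key_value(PROFILE, "counts", "newcarpages")` returns 47 and `preference_value(PROFILE, "carpreference")` returns "New", which correspond to the "New Car Pages Count" and "Car Preference" examples above.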
Caller Library¶
A caller library in C# is provided to build the user profile request. The library has predefined string constants for all the Keys and Sub-Keys for both applications.
Note
This library can be referenced from the APEL NuGet server. The library is versioned.
Environment | Nuget Address |
---|---|
Local | 172.16.0.11:41111 |
Staging | 10.10.4.55:41111 |
Production | 10.10.4.55:41111 |
Recommendation Engine¶
This generic endpoint provides recommendations of similar items for a given application (CarWale, BikeWale) and an item of a particular type (Models, Versions, etc.). To get recommendations, follow the protocol specified here.
Request Message¶
Below is an explanation of the fields in the input request message:
- application - Application (enum) defined in the proto file
- itemtype - Item type (enum) defined in the proto file, e.g., MODELS, VERSIONS
- cookieid - Unique user identifier
- item - Item identifier, e.g., ModelID, VersionID
- recommendationcount - Number of items to be returned as recommendations (max 20)
- enableboost - Flag to toggle boosting based on user preferences. Refer to this to learn more about boosting. Explicitly pass true to enable personalised recommendations
- boostkeys (optional) - List of metrics used for boosting the item score based on a particular user profile, e.g., bodystyle, price
Warning
You must send a non-empty, non-null value for cookieid if the enableboost flag is set to true. Otherwise, an error is thrown.
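The constraints above can be checked on the client before a call is made. The following is a minimal sketch, assuming a plain Python stand-in for the request message. The `RecommendationRequest` class and `validate` function are hypothetical and not part of the proto file or the caller library, and the lower bound of 1 for recommendationcount is an assumption.

```python
from dataclasses import dataclass, field
from typing import List

MAX_RECOMMENDATION_COUNT = 20  # "max 20" per the field description


@dataclass
class RecommendationRequest:
    """Hypothetical stand-in for the request message fields listed above."""
    application: str
    itemtype: str
    item: int
    recommendationcount: int
    cookieid: str = ""
    enableboost: bool = False
    boostkeys: List[str] = field(default_factory=list)


def validate(request: RecommendationRequest) -> List[str]:
    """Return a list of problems; an empty list means the request looks valid."""
    problems = []
    # Lower bound of 1 is an assumption; the document only states the maximum.
    if not 1 <= request.recommendationcount <= MAX_RECOMMENDATION_COUNT:
        problems.append("recommendationcount must be between 1 and 20")
    # Documented rule: boosting needs a non-empty, non-null cookieid.
    if request.enableboost and not (request.cookieid or "").strip():
        problems.append("cookieid is required when enableboost is true")
    return problems
```

Running such a check first avoids a round trip that the service would reject anyway, for example a boosted request sent for a user with no cookie.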
Background Process¶
This section explains the algorithm and process flow behind the scenes to generate personalised recommendations.
About Recommender Systems¶
To generate personalised recommendations, we first need to store the similar items for each item at a global level. To do this, we initially generated an item-to-similar-items mapping using a Hybrid recommender system, which combines two techniques: Collaborative Filtering (CF) and Content-Based Filtering (CBF).
Refer to this article about recommendation systems.
Collaborative Filtering - Collaborative filtering methods are based on collecting and analyzing a large amount of information on users' behaviors, activities, or preferences, and predicting what users will like based on their similarity to other users. A key advantage of this approach is that it does not rely on machine-analyzable content, so it can accurately recommend complex items without requiring an understanding of the item itself.
Content-Based Filtering - Content-based filtering is a domain-dependent algorithm that places more emphasis on analyzing the attributes of items in order to generate predictions. This technique does not need the profiles of other users, since they do not influence the recommendation. It uses an item profile (i.e., a set of discrete attributes and features) that characterizes the item, and finds similar items within the corpus. To abstract the features of the items in the system, an item presentation algorithm is applied. A widely used algorithm is the tf–idf representation.
Each of these techniques has its own pros and cons, so to get reliable recommendations we combine the results from both and store them.
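The document does not specify how the two sets of results are combined. As one possible illustration only, the sketch below blends per-item similarity scores from the two techniques with a weighted sum after scaling each set to a common range. The `normalise` and `combine` names, the scaling step, and the default weight of 0.5 are all assumptions.

```python
def normalise(scores):
    """Scale scores so the best item gets 1.0 (assumed preprocessing step)."""
    if not scores:
        return {}
    top = max(scores.values())
    if top <= 0:
        return {item: 0.0 for item in scores}
    return {item: value / top for item, value in scores.items()}


def combine(cf_scores, cbf_scores, cf_weight=0.5):
    """Blend CF and CBF similarity scores for one source item.

    Items missing from one technique contribute 0 from that side.
    Returns (item, score) pairs, best first; ties are ordered by name.
    """
    cf = normalise(cf_scores)
    cbf = normalise(cbf_scores)
    blended = {
        item: cf_weight * cf.get(item, 0.0) + (1 - cf_weight) * cbf.get(item, 0.0)
        for item in set(cf) | set(cbf)
    }
    return sorted(blended.items(), key=lambda pair: (-pair[1], pair[0]))


# Hypothetical scores for items similar to one model.
similar = combine({"Jazz": 4.0, "Baleno": 2.0}, {"Baleno": 0.9, "Ciaz": 0.9})
```

In this example "Baleno" ranks first because both techniques support it, while "Jazz" and "Ciaz" are each backed by only one technique.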
Data Points¶
Below are the data points considered for generating recommendations for each application and item type:
- CarWale - Models
  - Users' Behaviour (Collaborative Filtering)
    - Organic Comparisons (Selected by user)
    - Price Quotes
  - Item Features (Content-Based Filtering)
    - Body Style
    - Avg Price
    - Min Price
    - Max Price
    - Segment
    - Sub Segment
    - Looks Rating
    - Performance Rating
    - Comfort Rating
    - ValueForMoney Rating
    - FuelEconomy Rating
    - Car Transmission
    - Fuel Type
    - Color Type
  - Boosting Parameters
    - Body Style
    - Price Bucket
    - Sub Segment
Process Flow¶
The figure below explains the process flow of the entire system.
All user actions on the website are tracked through the Bhrigu tracking endpoint. User profiles are then created in real time and stored in a datastore. The first technique described above, Collaborative Filtering, uses a subset of the same data (details mentioned above) to compute similar items. The second technique, Content-Based Filtering, takes all available items into consideration and computes the similarity between them on the basis of their features (details mentioned above).
The final set of recommendations is generated by combining the outputs of both techniques, and these recommendations are stored in a database.
An endpoint then serves personalised recommendations. It uses the same stored recommendations, but reorders the items based on the user preferences inferred from user profiles.
This reordering process, also called Score Boosting, is explained in the following section.
Score Boosting¶
Score boosting adds a per-user flavour to the generated recommendations. Instead of recommending the same list of items to every user for a given item, we can target each user based on the preferences inferred from their user profile.
The list of recommendations contains items and their corresponding scores. Each item has some attributes (e.g., bodystyle, price), and a user profile contains the user's preferences for each of these attributes.
We define an Affinity function that takes a user profile, the boosting parameters (mentioned above), and an item as input arguments. It returns a score that denotes the affinity of the user towards the given item.
AF(UserProfile, BoostingParams, Item) = Affinity Score
This affinity score is added to the original score of the item. After the score of every item has been boosted, the list is reordered to give the final personalised recommendations.
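The boosting step can be sketched as follows. The document does not give the actual form of the affinity function, so this example assumes one simple choice: for each boosting key, the share of the user's activity in that section that matches the item's attribute value. The `affinity` and `boost` names and the attribute lookup table are hypothetical.

```python
def affinity(user_profile, boost_keys, item_attributes):
    """AF(UserProfile, BoostingParams, Item), under the assumption above.

    For each boost key, add the fraction of the user's counts in that
    section that fall on the item's attribute value.
    """
    userdata = user_profile.get("userdata", {})
    score = 0.0
    for key in boost_keys:
        section = userdata.get(key, {})
        total = sum(section.values())
        value = item_attributes.get(key)
        if total > 0 and value is not None:
            score += section.get(value, 0) / total
    return score


def boost(recommendations, attributes, user_profile, boost_keys):
    """Add each item's affinity score to its original score, then reorder."""
    boosted = [
        (item, score + affinity(user_profile, boost_keys, attributes.get(item, {})))
        for item, score in recommendations
    ]
    return sorted(boosted, key=lambda pair: -pair[1])


# Body style counts taken from the example user profile in this document.
user = {"userdata": {"bodystyle": {"Compact Sedan": 8, "Hatchback": 14, "Sedan": 8}}}
# Hypothetical global recommendations and item attributes.
recs = [("Ciaz", 0.9), ("Baleno", 0.8)]
attrs = {"Ciaz": {"bodystyle": "Sedan"}, "Baleno": {"bodystyle": "Hatchback"}}
personalised = boost(recs, attrs, user, ["bodystyle"])
```

Here "Baleno" moves ahead of "Ciaz" because this user has viewed hatchbacks more often than sedans, even though "Ciaz" had the higher original score.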