Match Magic: The Blueprint of Our Algorithm

We're building an app that uses an ML algorithm to match flatmates and flats, in contrast to current platforms, which offer cluttered threads of images and messages.

Users complete a Typeform detailing their preferences through a series of multiple-choice and open-ended questions. These initial inputs are critical as they are the attributes that directly influence the compatibility factor between potential flatmates.

We've decided to split the process into several stages:

Stage 1:

The initial pool of users is divided into clusters based on attributes such as budget, location, age and gender. This is formally known as k-means clustering.
Organising users into broad groups first speeds up the process, because the detailed comparisons that follow only need to run within each cluster.
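
As a rough illustration of the clustering step, here is a minimal k-means sketch in plain Python. The feature choice (budget and age, scaled by hand), the sample users and the `kmeans` helper are all assumptions made for this example; categorical attributes such as location or gender would need to be encoded as numbers first, and a production system would more likely use a library implementation.

```python
import math

def kmeans(points, k, iterations=20):
    """Minimal k-means: returns (centroids, labels) for a list of numeric tuples."""
    # Assumption: start from the first k points so the sketch is deterministic;
    # real implementations usually pick random or k-means++ starting centroids.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each user to the nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, labels

# Hypothetical users as (monthly budget in hundreds, age in decades),
# scaled so that both features sit in a similar range.
users = [(6.0, 2.1), (6.5, 2.3), (12.0, 3.4), (11.5, 3.0), (6.2, 2.2)]
centroids, labels = kmeans(users, k=2)
print(labels)  # [0, 0, 1, 1, 0]
```

The three lower-budget, younger users end up in one cluster and the two higher-budget users in the other, so later comparisons only run inside each group.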

Stage 2:

Once the clusters are ready, we compare user profiles within each cluster. Consider the answers submitted by User A and User B.

When both users give the same response to a question, that question scores 1 point. If their responses differ, it scores 0.
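
The 1-or-0 rule can be sketched in a few lines of Python. The question keys and answers below are hypothetical, chosen only to show the scoring.

```python
def match_score(answer_a, answer_b):
    """Return 1 when two answers to the same question are identical, else 0."""
    return 1 if answer_a == answer_b else 0

# Hypothetical answers from two users in the same cluster.
user_a = {"smoker": "no", "pets": "yes", "sleep": "night owl"}
user_b = {"smoker": "no", "pets": "no", "sleep": "night owl"}

scores = {question: match_score(user_a[question], user_b[question]) for question in user_a}
print(scores)  # {'smoker': 1, 'pets': 0, 'sleep': 1}
```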

Where a question offers several possible answers, such as university course, a simple match/no-match score is too coarse, so we quantify how similar the answers are. We do this by evaluating the correlation between the choices. For example, users who study physics and users who study maths might match strongly with each other, giving a correlation coefficient (r) of 0.7.
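
One way to picture this is a lookup table of pairwise similarities. The sketch below assumes the correlation value is used directly as the score for that question, which the description above does not spell out; the `COURSE_SIMILARITY` table and `course_score` function are hypothetical, and only the physics/maths value of 0.7 comes from the text.

```python
# Hypothetical similarity table keyed by unordered pairs of courses.
# Only the physics/maths value (0.7) is taken from the example in the text.
COURSE_SIMILARITY = {
    frozenset({"physics", "maths"}): 0.7,
}

def course_score(course_a, course_b, default=0.0):
    """Score two course answers: 1.0 if identical, else the tabulated similarity."""
    if course_a == course_b:
        return 1.0
    # frozenset makes the lookup order-independent: (a, b) and (b, a) agree.
    return COURSE_SIMILARITY.get(frozenset({course_a, course_b}), default)

print(course_score("physics", "maths"))    # 0.7
print(course_score("maths", "physics"))    # 0.7
print(course_score("physics", "physics"))  # 1.0
print(course_score("physics", "history"))  # 0.0 (pair not in the table)
```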

Stage 3:

Each question or parameter is then weighted according to its importance. For example, shared interests have a more significant impact on compatibility than differences in sleeping habits, such as being an early bird or a night owl. In our case, each weight takes a value between 1 and 10.

The questions are initially assigned preset weights. These are later adjusted by machine learning according to user behaviour, such as matching, initiating contact, and messaging frequently. Compatibility is thus continually optimised based on real interaction patterns.
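
The text does not say which learning method adjusts the weights, so the following is only a toy sketch of the idea, not the actual approach. It assumes a simple rule: when a pair goes on to interact, the questions they agreed on gain a little weight, and when they do not, those questions lose a little. The `adjust_weights` function, the `learning_rate` value and the question names are all hypothetical.

```python
def adjust_weights(weights, question_scores, engaged, learning_rate=0.5):
    """Nudge each weight after observing the outcome for one pair of users.

    weights: question -> current weight (between 1 and 10)
    question_scores: question -> similarity score (0 to 1) for this pair
    engaged: True if the pair matched and kept messaging, False otherwise
    """
    direction = 1 if engaged else -1
    updated = {}
    for question, weight in weights.items():
        score = question_scores.get(question, 0)
        # Questions the pair agreed on get credit (or blame) for the outcome.
        new_weight = weight + direction * learning_rate * score
        # Clamp to the 1-10 range used for weights.
        updated[question] = min(10.0, max(1.0, new_weight))
    return updated

weights = {"shared_interests": 8.0, "sleep_schedule": 3.0}
pair_scores = {"shared_interests": 1, "sleep_schedule": 0}
new_weights = adjust_weights(weights, pair_scores, engaged=True)
print(new_weights)  # {'shared_interests': 8.5, 'sleep_schedule': 3.0}
```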

Users can also prioritise questions during onboarding, which helps us with the weighting.

Stage 4:

The overall score is the sum, across all questions, of each question's match score multiplied by its weight. Responses that do not match score 0 and so contribute nothing.

This process allows for flexibility in matching users with varying numbers of responses. For example, a user who answers 12 questions could be an ideal match for another who answers 15, as long as their responses align. This ensures compatibility is based on mutual answers, not the total number of questions completed.
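
A minimal sketch of this scoring step, assuming answers are stored per user as a question-to-answer mapping and that only questions both users answered are compared. The `overall_score` function, the weights and the sample answers are hypothetical.

```python
def overall_score(answers_a, answers_b, weights):
    """Sum weight * match score over the questions both users answered."""
    # Only mutually answered questions count, so users with different
    # numbers of responses can still be compared.
    shared_questions = answers_a.keys() & answers_b.keys()
    total = 0
    for question in shared_questions:
        matched = 1 if answers_a[question] == answers_b[question] else 0
        total += weights[question] * matched
    return total

# Hypothetical weights (1 to 10) and answers; user_b answered one extra question.
weights = {"smoker": 9, "pets": 6, "sleep": 3, "cooking": 4}
user_a = {"smoker": "no", "pets": "yes", "sleep": "night owl"}
user_b = {"smoker": "no", "pets": "no", "sleep": "night owl", "cooking": "often"}

print(overall_score(user_a, user_b, weights))  # 12
```

Here the matching answers on "smoker" (weight 9) and "sleep" (weight 3) contribute 12, the mismatch on "pets" contributes 0, and "cooking" is ignored because only one user answered it.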

Lastly, the total scores of User A and User B are compared.

What do you think?