Love of Data
eHarmony does three things to derive a feature set from its data and match people based on the questions they answer (29 dimensions):
1) Compatibility matching
a) obstreperous - noisy and difficult to control
b) romantic -
c) Distance -
d) Ethnicity -
This looks like a graph database use case, but they seem to handle it internally.
2) Affinity matching
Not all best matches
3) Match Distribution
1) Users - MongoDB
2) They then use classifiers to estimate compatibility based on users' behavior.
Their data is growing by 100 GB per day. They sharded the data by location because the data collected suggests that successful relationships are more likely to happen within 30 miles. Long-distance communication does occur, but it is an outlier and remains less common than local cases.
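To make the 30-mile observation concrete, here is a minimal sketch of how partitioning by location and a local-distance check could look. It is not eHarmony's implementation: the function names, the one-degree grid cell used as a partition key, and the haversine distance are all assumptions chosen for illustration.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def shard_key(lat, lon, cell_degrees=1.0):
    """Hypothetical partition key: the grid cell that contains the user."""
    return (math.floor(lat / cell_degrees), math.floor(lon / cell_degrees))


def is_local(user_a, user_b, max_miles=30.0):
    """True if two users, given as (lat, lon) tuples, are within max_miles."""
    return haversine_miles(*user_a, *user_b) <= max_miles


los_angeles = (34.05, -118.24)
pasadena = (34.15, -118.14)
san_francisco = (37.77, -122.42)

print(is_local(los_angeles, pasadena))       # nearby pair
print(is_local(los_angeles, san_francisco))  # long-distance pair
print(shard_key(*los_angeles))               # grid cell used as partition key
```

With a key like this, most candidate matches for a user sit in the same or a neighboring cell, so a query rarely has to touch distant partitions. Two users close together can still fall on opposite sides of a cell boundary, so a real system would also need to check adjacent cells.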