

An AI Soulmate Recommender System Based Only On Images


Researchers from the UK have used neural networks to develop an entirely image-based recommender system for online dating matches. It takes into account only whether two users are attracted to each other’s photos, rather than profile information such as job or age, and was found to outperform less ‘shallow’ systems in terms of obtaining an accurate match.

The resulting system is called Temporal Image-Based Reciprocal Recommender (TIRR), and uses Recurrent Neural Networks (RNNs) to interpret a user’s historical predilection for faces that he or she encounters while browsing for potential matches.

The paper is entitled – perhaps dishearteningly – Photos Are All You Need for Reciprocal Recommendation in Online Dating, and comes from two researchers at the University of Bristol, improving notably upon a similar system (called ImRec) released by the same team in 2020.

In tests, the system obtained state-of-the-art accuracy in its ability to predict reciprocal matches between users, improving not only on the researchers’ 2020 work, but also on other content-based dating reciprocal recommendation systems that take account of more detailed, text-based information in dating profiles.

Real World Dating Dataset

TIRR was trained on user information provided by an unnamed ‘popular’ online dating service with ‘several million registered users’, that only allows users to communicate with each other once each has ‘liked’ the other’s profile. The subset of data used included 200,000 subjects, split evenly between men and women, and approximately 800,000 user-expressed preferences across all the dating profiles.

Since the anonymous dating service providing the data only supports heterosexual matches, only male/female matches were covered in the research.

TIRR improves upon previous reciprocal recommender systems (RRS) designs in this field by directly calculating the probability of a match between two profiles, based solely on profile images. Prior systems instead predicted two unidirectional preferences and then aggregated them to obtain a prediction.
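The architectural difference can be sketched in a few lines of Python. The functions below are illustrative stand-ins rather than the paper’s models; the harmonic mean is shown because it is a common aggregator in earlier unidirectional RRS work.

```python
# Illustrative sketch only: these stand-ins are not the paper's models.

def aggregate_unidirectional(score_a_likes_b, score_b_likes_a):
    """Older RRS style: predict each direction separately, then combine.
    The harmonic mean penalizes one-sided interest."""
    if score_a_likes_b + score_b_likes_a == 0:
        return 0.0
    return (2 * score_a_likes_b * score_b_likes_a
            / (score_a_likes_b + score_b_likes_a))

def direct_match_probability(model, photo_a, photo_b):
    """TIRR's style (sketched): a single model consumes both users'
    photos and outputs the probability of a mutual match directly."""
    return model(photo_a, photo_b)

# Mutual interest is capped by the weaker direction:
mutual = aggregate_unidirectional(0.9, 0.1)  # ≈ 0.18
```

The direct formulation lets the model learn interactions between the two photos that a post-hoc aggregation of one-way scores cannot capture.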

The researchers excluded users that had been removed from the dating service (for any reason, including voluntarily leaving), and excluded profiles that did not include face-based photos.

User histories were limited to one year back, in order to avoid potential anomalies that might occur as the dating site tweaked its algorithms over time. They were also limited to a maximum of 15 user preferences, since these were demonstrated as sufficient to prove the model design, whilst more extensive use of preferences degraded performance and increased training times.

Additionally, some of the more avid or long-term users had histories with thousands of preferences, which risked skewing the weight of the obtained features and further prolonging training times.

Siamese Network

TIRR is formulated using a Siamese network, typically used for ‘one-shot’ learning.

A template Siamese network, where parallel Convolutional Neural Networks (CNNs) share weights but not data. They also share a loss function derived from the outputs of each CNN, and a ground truth label.  Source:


The network was trained using binary crossentropy, a common loss function in neural networks, and one which the researchers found to give superior results compared to contrastive loss. The latter is most effective in systems that evaluate parity between two faces, but since this is not the objective of TIRR, it’s an approach that performs poorly in this context.
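A minimal NumPy sketch may help make the shared-weight arrangement and the loss concrete. The 2048-dimensional input features and the single linear head are assumptions for illustration, though the 128-dimensional embedding matches the vectors the paper later visualizes.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# The defining trait of a Siamese network: BOTH branches share weights.
W_embed = rng.normal(scale=0.1, size=(128, 2048))  # photo features -> 128-d
w_head = rng.normal(scale=0.1, size=256)           # scores the joined pair

def embed(photo_features):
    # The same parameters are applied to either user's photo features.
    return np.tanh(W_embed @ photo_features)

def match_probability(photo_a, photo_b):
    pair = np.concatenate([embed(photo_a), embed(photo_b)])
    return sigmoid(w_head @ pair)

def binary_crossentropy(y_true, p):
    eps = 1e-7  # clip to avoid log(0)
    p = np.clip(p, eps, 1 - eps)
    return -(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

a, b = rng.normal(size=2048), rng.normal(size=2048)
p = match_probability(a, b)
loss = binary_crossentropy(1.0, p)  # label 1.0: the pair matched
```

Binary crossentropy fits here because the target is a yes/no match probability, whereas contrastive loss pulls embeddings of "same" pairs together, a notion of facial similarity that TIRR does not need.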

The system must retain and build on information it accumulates as training iterates repeatedly over the same data. To make these decisions, the Siamese network in TIRR incorporates a Long Short-Term Memory (LSTM) network, which ensures that features deemed relevant are not discarded ad hoc as the framework builds its insights.
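A minimal NumPy LSTM cell illustrates the gating mechanism that decides what history to keep; the hidden size and random weights here are illustrative assumptions, not TIRR’s configuration.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

D, H = 128, 64  # input and hidden sizes: illustrative assumptions
Wf, Wi, Wo, Wc = (rng.normal(scale=0.1, size=(H, D + H)) for _ in range(4))

def lstm_step(x, h, c):
    z = np.concatenate([x, h])
    f = sigmoid(Wf @ z)              # forget gate: which old memory to keep
    i = sigmoid(Wi @ z)              # input gate: which new info to write
    o = sigmoid(Wo @ z)              # output gate: which memory to expose
    c = f * c + i * np.tanh(Wc @ z)  # updated cell state
    h = o * np.tanh(c)               # updated hidden state
    return h, c

# Run over a user's truncated history of up to 15 preference embeddings.
history = rng.normal(size=(15, D))
h, c = np.zeros(H), np.zeros(H)
for x in history:
    h, c = lstm_step(x, h, c)
# `h` now summarizes the user's historical taste.
```

The forget gate is what prevents relevant features from being discarded arbitrarily: discarding only happens when the learned gate decides the old memory no longer matters.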

The specific Siamese network architecture for TIRR.


The researchers found that the network trained very slowly when all the data was input at once, and subsequently split the training into three stages using three different subsets of the data. This brought an additional advantage: the researchers’ 2020 experiments had already demonstrated that training the male and female datasets separately improves the performance of a reciprocal recommender system.

The breakdown of separate training sessions for TIRR's Siamese network.



To evaluate TIRR’s performance, the researchers kept a portion of the obtained data to one side and ran it through the fully-converged system. However, since the system is quite novel, there are no directly analogous prior systems to which it could be compared.

Therefore the researchers first established a Receiver Operating Characteristic (ROC) curve baseline for the Siamese network, before using Uniform Manifold Approximation and Projection (UMAP) to reduce the 128-dimensional vectors to a form that can be easily visualized, in order to establish a coherent flow of likes and dislikes.
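An ROC baseline of this kind boils down to the area under the curve, which can be computed via the rank (Mann-Whitney) formulation. The NumPy sketch below is a generic illustration, not the paper’s evaluation code.

```python
import numpy as np

def roc_auc(labels, scores):
    """AUC via the rank (Mann-Whitney) formulation: the probability that
    a random positive is scored above a random negative (ties ignored)."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)  # 1-based ascending ranks
    n_pos, n_neg = labels.sum(), (~labels).sum()
    return (ranks[labels].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

# A model that ranks every match above every non-match scores 1.0;
# random scoring hovers around 0.5.
auc = roc_auc([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.1])  # 1.0
```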

On the left, the ROC of the Siamese network as a baseline indicator of performance; on the right, the UMAP visualization shows 'likes' in red, 'dislikes' in black.


TIRR was tested against collaborative filtering and content-based systems with a similar ambit, including the researchers’ prior work ImRec (see above), and RECON, an RRS from 2010, as well as the collaborative filtering algorithms RCF (a 2015 dating RRS based on text content of dating profiles) and LFRR (a similar project from 2019).

In all cases TIRR was able to offer superior accuracy, though only marginally compared to LFRR, possibly indicating correlated factors between the text content of profiles and the perceived attractiveness of the subjects’ profile photos.

The near-parity between image-based TIRR and the more text-based LFRR allows for at least two possibilities: that users’ perception of visual attractiveness is influenced by the text content of profiles; or that text content only receives sustained attention and approval when the associated picture is perceived as attractive.

For obvious reasons, the research team is unable to release the dataset or source code for TIRR, but encourages other teams to duplicate and confirm its approach.


n.b The images used in the main illustration are from