- Gloomhaven (ID: 174430)
- Agricola (Revised Ed.) (ID: 200680)
- Gaia Project (ID: 220308)
- Legendary Encounters: Alien (ID: 146652)
- Terraforming Mars (ID: 167791)
- Twilight Imperium 4th Ed. (ID: 233078)
Background
Over the past few years I have become pretty obsessed with modern board games. This may sound unusual to the uninitiated, but modern board games have come a long way in complexity, theme and just overall entertainment. I also find that board games are a great way to unplug these days and enjoy the company and conversation that comes with people gathered together to play a game. It seemed only natural to combine my interest in board games with my interest in data analysis.
Motivation
The motivation behind this objective similarity service is that, while browsing BoardGameGeek, I often find myself wanting a list of similar games. I picture this being something like a new section in the side banner alongside the other game characteristics. I knew a potentially more useful list would be a 'recommendation list', but without direct, speedy access to the underlying database and to user-ranked lists, it would be very time consuming to gather the necessary data for a recommendation system. With that constraint, I opted for an objective similarity system that returns a list of games that are similar based on game characteristics rather than opinion. This is that attempt.
It's important to mention what this isn't: a recommendation system. Some results may not make immediate sense at first glance, but within the confines of the game characteristics used, the similarities may become more apparent. This isn't to say that the system is always correct. Far from it. The results might simply not align with expectations because one might think of the games differently than the model does.
Methodology
The method being used here is quite straightforward when all is said and done. This system uses Locality Sensitive Hashing to determine the similarity between games. In particular, it takes into consideration the following game characteristics (as defined by BoardGameGeek):
- minimum age
- minimum playtime
- maximum playtime
- maximum player count
A key thing to note is that the hash table was built with data pulled on 2018-06-27 and contained the top 6,000 or so games (ranked according to BGG). This means that the system can only return a game as a match if it was in the top 6,000 games at that time.
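To make the idea concrete, here is a minimal sketch of how Locality Sensitive Hashing over the four characteristics listed above could work. It is not the code behind this service. The random-hyperplane hash family, the z-score scaling, the number of tables and bits, and every name in it (`GAMES`, `standardize`, `HyperplaneLSH`, `build_index`) are assumptions made for illustration, and the feature values are made up rather than pulled from BoardGameGeek.

```python
import math
import random
from collections import defaultdict

# Hypothetical feature rows: (min age, min playtime, max playtime, max players).
# The names and values are invented for illustration, not real BGG data.
GAMES = {
    "game_a": (12, 60, 120, 4),
    "game_b": (12, 60, 150, 4),
    "game_c": (8, 15, 30, 6),
    "game_d": (8, 20, 30, 5),
    "game_e": (14, 240, 480, 6),
}


def standardize(rows):
    """Z-score each column so that no single characteristic dominates."""
    columns = list(zip(*rows.values()))
    means = [sum(col) / len(col) for col in columns]
    # Fall back to 1.0 when a column is constant to avoid dividing by zero.
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in col) / len(col)) or 1.0
        for col, m in zip(columns, means)
    ]
    return {
        name: tuple((x - m) / s for x, m, s in zip(row, means, stds))
        for name, row in rows.items()
    }


class HyperplaneLSH:
    """Several hash tables, each keyed by the signs of random projections."""

    def __init__(self, dim, n_tables=8, n_bits=3, seed=0):
        rng = random.Random(seed)
        # One set of n_bits random hyperplanes per table.
        self.planes = [
            [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_bits)]
            for _ in range(n_tables)
        ]
        self.tables = [defaultdict(set) for _ in range(n_tables)]
        self.vectors = {}

    @staticmethod
    def _key(planes, vec):
        # Each bit records which side of a hyperplane the vector falls on.
        return tuple(
            sum(p * v for p, v in zip(plane, vec)) >= 0 for plane in planes
        )

    def add(self, name, vec):
        self.vectors[name] = vec
        for planes, table in zip(self.planes, self.tables):
            table[self._key(planes, vec)].add(name)

    def query(self, name, k=3):
        """Return up to k games sharing a bucket with `name`, nearest first."""
        vec = self.vectors[name]
        candidates = set()
        for planes, table in zip(self.planes, self.tables):
            candidates |= table.get(self._key(planes, vec), set())
        candidates.discard(name)
        return sorted(
            candidates, key=lambda other: math.dist(vec, self.vectors[other])
        )[:k]


def build_index(rows, **kwargs):
    """Standardize the rows and load them into a fresh index."""
    index = HyperplaneLSH(dim=4, **kwargs)
    for name, vec in standardize(rows).items():
        index.add(name, vec)
    return index


index = build_index(GAMES)
print(index.query("game_a"))
```

Games whose vectors fall on the same side of every hyperplane in at least one table become candidates, and only those candidates are ranked by distance. This also illustrates the limitation noted above: a game that was never added to the tables can never come back as a match.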
Future Work
I hope to compile a more thorough walkthrough of how this system was built.
I also hope to look into using the board game description along with some Natural Language Processing (NLP) techniques as another approach to determining game similarity.