Insights on the Hierarchy of Letters in Scrabble Using Cosine Similarity, Minimum Spanning Tree, and Centrality Analysis

Document Type

Conference Proceeding

Publication Date

3-7-2024

Abstract

This study aims to generate insights on the hierarchy and importance of letters in the game Scrabble by employing two operational research frameworks. Both frameworks begin with a vector space model whose basis vectors are all the valid Scrabble words and in which each letter is treated as a vector. A network of the letters is then constructed in which the edge weight between each pair of letters is determined by the cosine similarity of the corresponding vectors, which is effectively a measure of the co-occurrence rate of the two letters. The first framework continues by obtaining the minimum spanning tree (MST) of the network and performing centrality analysis on the MST. This yields a hierarchy of the letters that shows how letters lower in the hierarchy depend on higher-level letters. The second framework instead performs centrality analysis on the original network of letters and results in a ranking of letters based on their co-occurrence rate with other letters. Based on the frameworks in the study, the letter E emerges as the highest-ranked letter, while the letter Q consistently ranks at the bottom. The study thus demonstrates how the two frameworks can be applied to a novel problem and to other possible applications of a similar nature.
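
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: the toy word list `WORDS` is hypothetical (the study uses all valid Scrabble words), letter vectors are taken to be binary (1 if the letter appears in the word), the MST is computed on the distance 1 - similarity, degree centrality is used on the MST, and weighted degree (strength) is used on the full network. The abstract does not specify which centrality measures or distance transformation the study uses.

```python
from itertools import combinations
from math import sqrt

# Hypothetical toy lexicon; the study uses the full list of valid Scrabble words.
WORDS = ["quiz", "tree", "rate", "eat", "tea", "seat", "quit", "zest"]


def letter_vectors(words):
    """Map each letter to a binary vector with one coordinate per word (assumption)."""
    letters = sorted(set("".join(words)))
    return {c: [1 if c in w else 0 for w in words] for c in letters}


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))


def similarity_network(vectors):
    """Return {(a, b): cosine similarity} for every unordered letter pair."""
    return {(a, b): cosine(vectors[a], vectors[b])
            for a, b in combinations(sorted(vectors), 2)}


def minimum_spanning_tree(nodes, sims):
    """Kruskal's algorithm on distances d = 1 - similarity (an assumption)."""
    parent = {n: n for n in nodes}

    def find(n):
        # Follow parent links to the component root, compressing the path.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    tree = []
    # Smallest distance first, i.e. most similar letter pairs first.
    for (a, b), _ in sorted(sims.items(), key=lambda kv: 1 - kv[1]):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb
            tree.append((a, b))
    return tree


def degree_centrality(nodes, edges):
    """Fraction of the other nodes each node is directly linked to."""
    deg = {n: 0 for n in nodes}
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return {n: d / (len(nodes) - 1) for n, d in deg.items()}


def strength(nodes, sims):
    """Sum of similarity weights on each node's edges in the full network."""
    total = {n: 0.0 for n in nodes}
    for (a, b), s in sims.items():
        total[a] += s
        total[b] += s
    return total


vectors = letter_vectors(WORDS)
letters = sorted(vectors)
sims = similarity_network(vectors)

# Framework 1: centrality on the MST of the letter network.
mst = minimum_spanning_tree(letters, sims)
mst_rank = sorted(degree_centrality(letters, mst).items(), key=lambda kv: -kv[1])

# Framework 2: centrality on the original (complete) letter network.
full_rank = sorted(strength(letters, sims).items(), key=lambda kv: -kv[1])
```

With binary vectors, the cosine similarity of two letters equals the number of words containing both, divided by the geometric mean of the number of words containing each, which is one way to read the "co-occurrence rate" mentioned in the abstract. Rankings from a toy list like this one are not meaningful and should not be compared with the study's results.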
