Overview
The Final Fantasy series has had an enormous effect on the gaming world. The first game was developed by Square (now Square Enix) and released for the Nintendo Entertainment System (NES) in 1987. It was considered an outlier at the time, when most games had simplistic plots and little to no character development. For this reason, the game was reportedly named “Final Fantasy”: its director thought such an unusual game would fail and end his career. History turned out differently. Known for its rich lore, deep interactions between characters, and outstanding music, the series changed the role-playing genre forever and became an international hit, selling more than 164 million units worldwide as of October 2021.

Out of all the 95 Final Fantasy games, we chose Final Fantasy XIV for the following reasons:

The final goal of this project is to understand the game’s characters and world better through network and text analysis, comprehend how the characters are related to each other, and see the importance of the characters in their respective communities.

If you are interested in how we reached our goal or want to see our code, you can follow this link to our Explainer Notebook.
We analyzed two datasets for Final Fantasy XIV: the fan wiki of Final Fantasy XIV and the game’s dialogue.

The Final Fantasy XIV wiki was used to download all characters' descriptions and attributes. The data was extracted using the Fandom wiki’s API and stored as JSON. Regular expressions were used to extract the description of each character along with the attributes of interest, race and affiliation. Other attributes were also extracted: gender, age, and occupation. If you want to download the files containing the character data, click here. (You will have access to a zip file containing a .txt file for each character on the wiki.)

All the dialogue between characters was obtained for text and sentiment analysis, to explore the game’s characters and world through their text interactions.

2.1 - Data Preprocessing
Our strategy was to use regular expressions that produce more false positives than false negatives and then clean the edge cases manually, based on our knowledge of the dataset.
The following data preprocessing was performed:
For text analysis, the following was done:

2.2 - Data Statistics:
Number of characters: 385
Total data size: 3.21 MB
Number of links (directed): 1477
Number of links (undirected): 1131
Most connected character (in-degree): Alphinaud Leveilleur, 56
Most connected character (out-degree): Alphinaud Leveilleur, 33
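The regular-expression extraction of character attributes described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the sample wikitext, the infobox field names, and the helper names `extract_attributes` and `strip_links` are all assumptions made for the example. The field pattern is deliberately permissive, in the spirit of preferring false positives that are cleaned up afterwards.

```python
import re

# Hypothetical infobox wikitext; the real wiki pages may use other field names.
SAMPLE_WIKITEXT = """{{Infobox character
|name = Alphinaud Leveilleur
|race = [[Elezen]]
|affiliation = [[Scions of the Seventh Dawn]]
|gender = Male
}}
Alphinaud is a character in the game."""

# Permissive pattern: any "|field = value" line, value taken up to end of line.
FIELD_PATTERN = re.compile(r"^\|[ \t]*(\w+)[ \t]*=[ \t]*(.*?)[ \t]*$", re.MULTILINE)
# Matches [[Target]] and [[Target|Label]], capturing the visible text.
LINK_PATTERN = re.compile(r"\[\[(?:[^|\]]*\|)?([^\]]+)\]\]")


def strip_links(value):
    """Replace wiki links with their visible text."""
    return LINK_PATTERN.sub(r"\1", value)


def extract_attributes(wikitext, wanted=("race", "affiliation")):
    """Return the first value found for each wanted infobox field."""
    found = {}
    for key, raw in FIELD_PATTERN.findall(wikitext):
        key = key.lower()
        if key in wanted and key not in found:
            found[key] = strip_links(raw)
    return found


attributes = extract_attributes(SAMPLE_WIKITEXT)
print(attributes)
```

Passing `wanted=("gender", "age", "occupation")` would pull the secondary attributes in the same way; fields missing from a page are simply absent from the returned dictionary.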
FFXIV (Undirected) Network
3.1 - Network Analysis
The network we created contains 385 nodes, each representing a character, and 1472 edges. First, we discarded the isolated nodes, leaving 335 nodes. We then extracted the giant component to obtain the final network, which has 320 nodes and 1449 links.

While there are significant hubs, the network is not dominated by a single entity. This is expected: in an online RPG, the protagonist is less critical to the plot’s development than in single-player games.

Studying the in-degree, we found that the 5 most connected characters are:
Analyzing the out-degree, we can see that the 5 most connected characters are:

Alphinaud Leveilleur is the most connected character by both in-degree and out-degree. He is a significant character in the game. It might be surprising that the main character is not the most connected one, but it makes sense in the context of the dataset: the protagonist of the game is silent. In such plots, a companion character is typically talkative and well connected, which breaks the monotony of the protagonist’s silence.

The plots of the degree distributions are reported below. The in-degree distribution shows that only a few characters have a degree higher than 25, while the great majority fall between 0 and 10. The out-degree distribution behaves similarly, with the majority of the nodes below a degree of 10. Graphically, the in-degree distribution appears to follow a power law, indicating a scale-free network, as is typical of real networks. The out-degree distribution also seems to follow a power law, but it is less skewed to the right and more difficult to interpret.

We use the powerlaw function below to fit a power-law distribution and obtain the γ values for further analysis. For the in-degree distribution, the exponent (γ=2.33) lies between 2 and 3, which places the network in the scale-free regime (ultra-small world).

As for the out-degree distribution, the exponent is almost equal to the critical value of 3 (γ=3.01). While this makes the out-degree distribution more random-like, it still differs from a random network: at the critical point the average distance carries a double logarithmic correction, ln ln N, which shrinks the distances of this network relative to a random network. We will compare our network below to a random one with the same connection probability for further confirmation.

As seen above, the network’s degree distribution deviates significantly from that of the corresponding random network. Our graphical, power-law, and random-network comparisons are thus in agreement.
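To make the exponent fit and the regime reading above more concrete, here is a rough stand-in written with the standard library only. It is not the powerlaw function used in the project: it applies the common approximate maximum-likelihood estimate for a discrete power law, γ ≈ 1 + n / Σ ln(k / (k_min − 0.5)), which is known to be crude for small k_min. The toy degree sequence, the function names, and the tolerance around γ = 3 are assumptions made for illustration.

```python
import math


def estimate_gamma(degrees, k_min=1):
    """Approximate MLE of the exponent of a discrete power law.

    Uses gamma ~= 1 + n / sum(ln(k / (k_min - 0.5))) over degrees k >= k_min.
    """
    tail = [k for k in degrees if k >= k_min]
    if not tail:
        raise ValueError("no degrees at or above k_min")
    log_sum = sum(math.log(k / (k_min - 0.5)) for k in tail)
    return 1.0 + len(tail) / log_sum


def regime(gamma, tolerance=0.05):
    """Name the network regime associated with a degree exponent."""
    if abs(gamma - 3.0) <= tolerance:
        return "critical point"
    if gamma < 2.0:
        return "anomalous regime"
    if gamma < 3.0:
        return "scale-free (ultra-small world)"
    return "random-like (small world)"


# Toy heavy-tailed degree sequence, not the project's data.
toy_degrees = [1] * 60 + [2] * 15 + [3] * 7 + [4] * 4 + [5] * 3 + [8] * 2 + [20]
gamma = estimate_gamma(toy_degrees, k_min=2)
print(f"estimated gamma: {gamma:.2f} ({regime(gamma)})")
print(regime(2.33), "|", regime(3.01))
```

With the exponents reported in the text, `regime(2.33)` falls in the scale-free range and `regime(3.01)` is treated as sitting at the critical point.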
To perform the text analysis, we prepared the data by applying tokenization and lemmatization to the character descriptions obtained from the wiki pages.
Words related to the structure of the wiki rather than to a specific character of the story were also removed from the text. The frequency distribution of the 75 most common tokens is reported below.
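The preparation steps above can be sketched with the standard library as follows. This covers tokenization, removal of wiki-structure words, and frequency counting; lemmatization is left out because it needs an external library. The word list in `WIKI_STRUCTURE_WORDS` and the sample descriptions are hypothetical, since the project's actual list is not given.

```python
import re
from collections import Counter

# Hypothetical wiki-structure words; the project's actual list is not given.
WIKI_STRUCTURE_WORDS = {"gallery", "trivia", "references", "category", "edit"}


def tokenize(text):
    """Lowercase the text and keep runs of letters as tokens."""
    return re.findall(r"[a-z]+", text.lower())


def token_frequencies(descriptions, top_n=75):
    """Count tokens over all descriptions, skipping wiki-structure words."""
    counts = Counter()
    for text in descriptions:
        counts.update(t for t in tokenize(text) if t not in WIKI_STRUCTURE_WORDS)
    return counts.most_common(top_n)


sample = [
    "The Warrior of Light is the protagonist of Final Fantasy XIV.",
    "Trivia: the Warrior of Light appears in every Final Fantasy XIV expansion.",
]
top_tokens = token_frequencies(sample, top_n=5)
print(top_tokens)
```

Note that this simple tokenizer splits on apostrophes, so names such as “Y’shtola” would need special handling in a real run.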
The frequency plot of the most common words is as expected: the two most common words are “final” and “fantasy,” followed by “warrior” and “light.” The main character of the game is named “The Warrior of Light,” so the high frequency of these two words is unsurprising.

4.1 - Word Clouds
Word clouds are a visual representation of text data, typically used to depict keyword metadata (tags) on websites or to visualize free-form text. In this case, the tags are single words, and the importance of each tag is shown by font size: a bigger term means a greater weight.

Goal: Analyze the word cloud for some groups of characters
Strategy:
The largest affiliations are:
Garlean Empire: 16
High Houses of Ishgard: 10
Doma: 9

The word clouds we obtained are given below.
Since we are interested in the relations among the characters, we want to detect communities and study their characteristics. To do so, we apply the Louvain algorithm for community detection to the undirected network created previously. The algorithm detects 6 small communities with fewer than 10 characters each and 8 bigger communities with between 20 and 50 characters each. In the following plot, the network is drawn with a different color for every community. The characters belonging to the 5 most populous communities are reported below.

Characters belonging to the same community are expected to be more connected to each other in the game than characters belonging to different communities. To verify this more rigorously, it would be necessary to analyze the game in more depth and gather information about the actual story behind each character and their true relations in the game. We will not pursue this analysis. Instead, we analyze which words are most representative of each community and then study the average sentiment of these communities, to understand whether the characters of a group share a common positive or negative feeling.

We identified the most descriptive words for each community. The objective is to understand whether different communities are associated with different words. We used two methods: TF and TF-IDF. TF measures how often a specific word appears in a community’s text, while TF-IDF also considers how often that word appears across the whole dataset, giving more weight to words that are characteristic of a specific community and therefore more relevant.
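The difference between the two weightings can be illustrated with a short sketch. It assumes one common variant of the formula, with TF as the raw count and IDF as ln(N / number of communities containing the word); the project may have used a different variant. The toy token lists and the function names are made up for the example.

```python
import math
from collections import Counter


def tf_idf(community_tokens):
    """Map each community to {word: tf * idf}.

    tf is the raw count in the community; idf = ln(N / df), where df is the
    number of communities whose text contains the word.
    """
    n_docs = len(community_tokens)
    tf = {name: Counter(tokens) for name, tokens in community_tokens.items()}
    doc_freq = Counter()
    for counts in tf.values():
        doc_freq.update(counts.keys())
    return {
        name: {w: c * math.log(n_docs / doc_freq[w]) for w, c in counts.items()}
        for name, counts in tf.items()
    }


def top_words(scores, n=5):
    """Return the n highest-scoring words of one community."""
    return sorted(scores, key=scores.get, reverse=True)[:n]


# Toy token lists, not the project's data.
toy = {
    "A": ["final", "fantasy", "garuda", "garuda", "messenger"],
    "B": ["final", "fantasy", "yoshida", "naoki", "yoshida"],
}
scores = tf_idf(toy)
print(top_words(scores["A"], 2), top_words(scores["B"], 2))
```

Words that occur in every community, such as “final” and “fantasy” here, get an IDF of zero and drop out of the ranking, which is the behavior the comparison of the two methods relies on.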
The most common words using TF for the top 5 communities are given below:
The most common words using TF-IDF for the top 5 communities are given below:
It is clear that the TF-IDF analysis gives more interesting results when identifying the most relevant words of a community: it eliminates recurring words such as ‘fantasy’, ‘player’ and ‘final’, which relate to the game itself and are thus very common in all the communities. The results also show that TF-IDF never returns the same word for different communities, as happens in the TF analysis, which supports the view that TF-IDF is better at detecting the words that are relevant to a community.
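For the community detection step, the Louvain algorithm searches for a partition of the network with high modularity. The sketch below is not Louvain itself: it only computes the modularity score of a given partition of an undirected, unweighted graph, which is the quantity Louvain tries to increase. The toy graph (two triangles joined by one edge) and the node names are assumptions made for illustration.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Modularity Q of a partition of an undirected, unweighted graph.

    Q = sum over communities c of [ L_c / m - (d_c / (2m))^2 ], where m is the
    number of edges, L_c the number of edges inside c, and d_c the total
    degree of the nodes in c.
    """
    m = len(edges)
    internal = defaultdict(int)
    degree_sum = defaultdict(int)
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)


# Toy graph: two triangles connected by a single edge.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
split = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
single = {node: 0 for node in split}

q_split = modularity(edges, split)
q_single = modularity(edges, single)
print(q_split, q_single)
```

Separating the two triangles scores higher than placing every node in one community, which is the kind of improvement the algorithm looks for when it moves nodes between communities.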
Community 1
Community 2
Community 3
Community 4
Community 5
Alphinaud Leveilleur
Y’shtola Rhul
Estinien Wyrmblood
G’raha Tia
Noah van Gabranth
Alisaie Leveilleur
Thancred Waters
Regula van Hydrus
Tataru Taru
Gerolt Blackthorn
Y’shtola Rhul
Urianger Augurelt
Buscarron Stacks
Cid Garlond
Lina Mewrilah
G’raha Tia
Krile Mayer Baldesion
Foulques
Biggs and Wedge
Adalberta Sterne
Estinien Wyrmblood
Minfilia Warde
Ywain Deepwell
Gaius van Baelsar
Aldis
Krile Mayer Baldesion
Louisoix Leveilleur
Zhai’a Nelhah
Nael van Darnus
Deep Canyon
Tataru Taru
F’lhaminn Qesh
Lalai Lai
Livia sas Junius
Leavold
Unukalhai
Unukalhai
Waldeve
Midas nan Garlond
Mylla Swordsong
Biggs and Wedge
Emet-Selch
Ysayle Dangoulain
Nero tol Scaeva
Wide Gulley
Igeyorhm
Elidibus
Thordan VII
Rhitahtyn sas Arvina
Brithael Spade
Fordola rem Lupis
Lahabrea
Alberic Bale
Vitus quo Messalla
Khloe Aliapoh
Regula van Hydrus
Igeyorhm
Haldrath
Eline Roaille
T’kebbe Morh
Buscarron Stacks
Nabriales
Heustienne de Vimaroix
Bahamut
Zhloe Aliapoh
Foulques
Loghrif
Lucia Junius
Tiamat
Ejika Tsunjika
Ywain Deepwell
Mitron
Rasequin
Mide Hotgo
Mikoto Jinba
Pipin Tarupin
Niellefresne Thaudour
Thordan I
Matoya
Ramza Beoulve
Ysayle Dangoulain
Severian Lyctor
Hraesvelgr
Seiryu
Alma Beoulve
Thordan VII
Midnight Dew
Nidhogg
Genbu
Fran Eruyt
Alberic Bale
Fourchenault Leveilleur
Ratatoskr
Sophie
Ashelia B’nargin Dalmasca
Haldrath
Ardbert
Faunehm
Soroban
Rasler B’nargin Dalmasca
Heustienne de Vimaroix
Branden
Orn Khai
Feo Ul
Ba’Gamnan
Lucia Junius
Cylva
Vedrfolnir
An Lad
Eureka(primal)
Rasequin
Lamitt
Vidofnir
Ezel II
Mutamix Bubblypots
Thordan I
Nyelbert
The Steps of Faith
Titania
Alma bas Lexentale
Hraesvelgr
Renda-Rae
Midgardsormr
Tyr Beq
Jenomis cen Lexentale
Nidhogg
Ryne
Shiva
Doga
Ramza bas Lexentale
Ratatoskr
Beq Lugg
Ravana
Unei
Drake Rhodes
Tiamat
Lue-Reeq
Knights of the Round
Ultima Weapon
Jalzahn Daemir
Faunehm
Gaia
Sephirot
Alexander
Rowena
Orn Khai
Giott
Sophia
Quickthinx Allthoughts
Bajsaljen Ulgasch
Vedrfolnir
Granson
Zurvan
Cloud of Darkness
F’hobhas
The Steps of Faith
Ran’jit
Chieftain Moglin
Radovan
Jihli Aliapoh
Mide Hotgo
Lanbyrd
Kazagg Chah
Omega
Midnight Dew
Olvara
Lightning
Fourchenault Leveilleur
Seto
Noctis Lucis Caelum
Matoya
Sul Oul
Shantotto
Seiryu
The Twelve
Garuda
Genbu
Hydaelyn
Soroban
Zodiark
Beq Lugg
Final Coil of Bahamut
Lyna
Bismarck
Tesleen
Hythlodaeus
Halric
Sauldia
Chai-Nuzz
Tadric
Dulia-Chai
Tristol
Feo Ul
Shiva
Bismarck
Ravana
Sephirot
Sophia
Zurvan
Alexander
Susano
Lakshmi
Brayflox Alltalks
Chieftain Moglin
Ga Bu
Quickthinx Allthoughts
5.1 - Common words in the communities

Most common words per community (TF):

| Alisaie Leveilleur’s community | Estinien Wyrmblood’s community | Warrior of Light’s community | Alphinaud Leveilleur’s community | Edmont de Fortemps’s community |
| --- | --- | --- | --- | --- |
| man | garuda | yoshida | ga | marcelloix |
| woman | final | naoki | bu | character |
| player | fantasy | final | alisaie | final |
| hyuran | messenger | fantasy | kobold | fantasy |
| imp | xv | april | final | ehll |

Most common words per community (TF-IDF):

| Alisaie Leveilleur’s community | Estinien Wyrmblood’s community | Warrior of Light’s community | Alphinaud Leveilleur’s community | Edmont de Fortemps’s community |
| --- | --- | --- | --- | --- |
| woman | garuda | yoshida | alisaie | marcelloix |
| hyuran | messenger | naoki | kobold | ehll |
| imp | wind | april | titan | francel |
| unsavory | xv | fool | warrior | family |
| nero | statue | director | bu | craftsman |
We now proceed to a sentiment analysis of the communities, after gathering adequate text related to the story of the characters. We perform three analyses. First, we use the LabMT dataset to calculate the sentiment of each token from the textual descriptions of the characters obtained from the wiki pages, after tokenization and lemmatization. Next, we repeat the same process using the text of the written dialogues extracted from the game. This second analysis is expected to give a better understanding of the characters’ sentiment, since the dialogues carry more information about the characters’ feelings during the game. Lastly, we apply the VADER sentiment analysis tool to the dialogue text. This method also takes into account other aspects of speech, such as punctuation, capital letters, and specific expressions.

6.1 - Sentiment Analysis 1
Data: Text descriptions from the wiki pages
Method: LabMT
All values are just above 5, showing a slight positive sentiment. This result is expected: the text used for this analysis was collected from the character descriptions, which carry little information about the characters’ feelings. We now repeat the analysis with a more relevant dataset: written dialogues from the video game.

6.2 - Sentiment Analysis 2
Data: Dialogues from the video game
Method: LabMT
All values lie between 5 and 6, meaning there is a slight positive average sentiment in all communities. The error bars show little variance, meaning that the characters in each community take similar values. We now proceed with the VADER analysis, using the dialogue dataset.

6.3 - Sentiment Analysis 3
Data: Dialogues from the video game
Method: VADER
The VADER analysis gives more interesting results. All communities scored above zero, meaning the average sentiment is positive. In particular, the Warrior of Light’s community scored 0.869, indicating strongly positive text. In this case, the error bars also show much more variance among the characters of a community. For example, Lyse Hext’s community scored a positive 0.636 even though the error bars indicate that at least one character got a negative result.
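The LabMT-based scores in sections 6.1 and 6.2 amount to a frequency-weighted average of per-word happiness values, aggregated per character and then per community. The sketch below shows that computation under stated assumptions: the lexicon `TOY_LABMT` holds made-up values on LabMT’s 1 to 9 scale rather than the real word list, and the spread is taken to be the standard deviation of the per-character scores, which may not be how the project’s error bars were computed.

```python
from collections import Counter
from statistics import mean, stdev

# Illustrative happiness scores on a 1-9 scale; NOT the real LabMT values.
TOY_LABMT = {"friend": 7.7, "hope": 7.4, "light": 6.9,
             "war": 1.8, "death": 1.5, "the": 5.0}


def labmt_score(tokens, lexicon=TOY_LABMT):
    """Frequency-weighted average happiness of the tokens found in the lexicon.

    Returns None when no token is covered by the lexicon.
    """
    counts = Counter(t for t in tokens if t in lexicon)
    total = sum(counts.values())
    if total == 0:
        return None
    return sum(lexicon[w] * c for w, c in counts.items()) / total


def community_sentiment(character_tokens):
    """Mean and standard deviation of per-character scores in one community."""
    scores = [s for s in map(labmt_score, character_tokens.values()) if s is not None]
    if not scores:
        raise ValueError("no character has tokens covered by the lexicon")
    spread = stdev(scores) if len(scores) > 1 else 0.0
    return mean(scores), spread


# Hypothetical characters with already tokenized text.
community = {
    "Character A": ["the", "friend", "hope"],
    "Character B": ["the", "war", "death", "light"],
}
average, spread = community_sentiment(community)
print(round(average, 2), round(spread, 2))
```

A score above 5 reads as mildly positive and a score below 5 as mildly negative, which is how the community averages in the text are interpreted.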
We analyzed Final Fantasy XIV’s character descriptions and dialogue in this project. After data extraction and preprocessing using the Final Fantasy Fandom Wiki’s API and regular expressions, a directed network was built and found to behave as a sparse, scale-free network. The most linked characters met our expectations based on knowledge of the game, as they are central characters with a significant influence on the plot and the outcomes of the game.

The communities found through the Louvain algorithm have different relevant words, calculated using TF-IDF, which could be caused by differences in the plot of the story for those groups of characters. The subsequent sentiment analysis enabled us to study the positivity of the communities and the differences in sentiment between the text descriptions of the characters and the actual dialogues from the game. Using the LabMT dataset, the sentiment analyses resulted in a mildly positive sentiment for both the text descriptions and the dialogues, meaning either that the words collected from the texts could not convey the characters’ real feelings or that the characters truly had a rather neutral sentiment. More interesting results were achieved with VADER, which takes punctuation into account and scores whole sentences rather than single words. It produced a more realistic description of the feeling in the communities, although with higher variance among the characters.

The analyses could be improved by adding more information for every character, for example more dialogue or more backstory. This would allow us to better analyze the sentiment of each character and to create more accurate communities, based on the characters’ characteristics instead of their connections.
For the purpose of this project, however, we are satisfied with the knowledge we gathered by studying the information we had about the game, its characters, and their relations.