Draft:Semantic Brand Score



The Semantic Brand Score (SBS) is a measure of brand importance that is calculated on textual data{{Cite journal |last1=Schlaile |first1=Michael P. |last2=Bogner |first2=Kristina |last3=Muelder |first3=Laura |date=2021 |title=It's more than complicated! Using organizational memetics to capture the complexity of organizational culture |url=https://linkinghub.elsevier.com/retrieve/pii/S014829631930582X |journal=Journal of Business Research |language=en |volume=129 |pages=801–812 |doi=10.1016/j.jbusres.2019.09.035}}{{Cite book |last1=Santomauro |first1=Giuseppe |last2=Alderuccio |first2=Daniela |last3=Ambrosino |first3=Fiorenzo |last4=Migliori |first4=Silvio |chapter=Ranking Cryptocurrencies by Brand Importance: A Social Media Analysis in ENEAGRID |series=Lecture Notes in Computer Science |date=2021 |volume=12591 |editor-last=Bitetta |editor-first=Valerio |editor2-last=Bordino |editor2-first=Ilaria |editor3-last=Ferretti |editor3-first=Andrea |editor4-last=Gullo |editor4-first=Francesco |editor5-last=Ponti |editor5-first=Giovanni |editor6-last=Severini |editor6-first=Lorenzo |title=Mining Data for Financial Applications |chapter-url=https://link.springer.com/chapter/10.1007/978-3-030-66981-2_8 |language=en |location=Cham |publisher=Springer International Publishing |pages=92–100 |doi=10.1007/978-3-030-66981-2_8 |isbn=978-3-030-66981-2}}{{Cite journal |last1=Bashar |first1=Md Abul |last2=Nayak |first2=Richi |last3=Balasubramaniam |first3=Thirunavukarasu |date=2022-07-25 |title=Deep learning based topic and sentiment analysis: COVID19 information seeking on social media |url=https://doi.org/10.1007/s13278-022-00917-5 |journal=Social Network Analysis and Mining |language=en |volume=12 |issue=1 |pages=90 |doi=10.1007/s13278-022-00917-5 |issn=1869-5469 |pmc=9312316 |pmid=35911483}}. 
The measure is rooted in graph theory and partly connected to Keller's{{Cite journal |last=Keller |first=Kevin Lane |date=1993 |title=Conceptualizing, Measuring, and Managing Customer-Based Brand Equity |url=http://journals.sagepub.com/doi/10.1177/002224299305700101 |journal=Journal of Marketing |language=en |volume=57 |issue=1 |pages=1–22 |doi=10.1177/002224299305700101 |issn=0022-2429}} conceptualization of brand equity{{Cite journal |last=Fronzetti Colladon |first=Andrea |date=2018 |title=The Semantic Brand Score |url=https://linkinghub.elsevier.com/retrieve/pii/S0148296318301541 |journal=Journal of Business Research |language=en |volume=88 |pages=150–160 |doi=10.1016/j.jbusres.2018.03.026|arxiv=2105.05781 }}. The metric has been computed by examining different text sources, such as newspaper articles, online forums, scientific papers, or social media posts{{Cite journal |last1=Indraccolo |first1=Ugo |last2=Losavio |first2=Ernesto |last3=Carone |first3=Mauro |date=2023 |title=Applying graph theory to improve the quality of scientific evidence from textual information: Neural injuries after gynaecologic pelvic surgery for genital prolapse and urinary incontinence |url=https://onlinelibrary.wiley.com/doi/10.1002/nau.25133 |journal=Neurourology and Urodynamics |language=en |volume=42 |issue=3 |pages=669–679 |doi=10.1002/nau.25133 |issn=0733-2467 |pmid=36648454}}{{Cite news |last=Kasia |first=Parys |title=Polish Twitter on immigrants during the 2021 Belarus–European Union border crisis |url=https://www.linkedin.com/pulse/polish-twitter-immigrants-during-2021-belaruseuropean-kasia-parys |access-date=2024-04-03 |website=www.linkedin.com |language=en}}{{Cite journal |last1=Das |first1=Sibanjan Debeeprasad |last2=Bala |first2=Pradip Kumar |last3=Das |first3=Sukanta |date=2024 |title=Exploiting User-Generated Content in Product Launch Videos to Compute a Launch Score |url=https://ieeexplore.ieee.org/document/10478487 |journal=IEEE Access |volume=12 |pages=49624–49639 
|bibcode=2024IEEEA..1249624D |doi=10.1109/ACCESS.2024.3381541 |issn=2169-3536}}.

Definition and calculation

= Pre-processing =

To compute the Semantic Brand Score, it is necessary to convert the analyzed texts into word networks, i.e., graphs where each node signifies a word. Connections between words are formed based on their co-occurrence within a specified distance threshold (a number of words). Natural language pre-processing is usually conducted to refine texts, which involves tasks such as removing stopwords and applying stemming{{Cite book |last1=Perkins |first1=Jacob |title=Python 3 text processing with NLTK 3 cookbook |last2=Fattohi |first2=Faiz |date=2014 |publisher=Packt Publishing Ltd |isbn=978-1-78216-785-3 |edition=2nd |series=Quick answers to common problems |location=Birmingham}} to eliminate word affixes. Here is a sample network derived from pre-processing the sentence "The dawn is the appearance of light - usually golden, pink or purple - before sunrise".

File:Word co-occurrence network (range 3 words) - ENG.jpg
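The network construction described above can be illustrated with a minimal Python sketch. It assumes a small hand-picked stopword list, omits stemming, and links two words when they appear within `max_distance` positions of each other after stopword removal. The function names, the stopword list, and the exact reading of the distance threshold are illustrative assumptions, not part of the published method.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real pipelines use fuller lists).
STOPWORDS = {"the", "is", "of", "or", "a", "an", "and", "before"}

def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stopwords.

    Stemming is deliberately omitted in this sketch.
    """
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]

def cooccurrence_network(tokens, max_distance=3):
    """Return a Counter mapping undirected edges (sorted word pairs) to weights.

    Two words are linked when they are at most `max_distance` positions apart
    (one possible reading of the distance threshold).
    """
    edges = Counter()
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + 1 + max_distance]:
            if other != word:
                edges[tuple(sorted((word, other)))] += 1
    return edges

sentence = ("The dawn is the appearance of light - usually golden, "
            "pink or purple - before sunrise")
tokens = tokenize(sentence)
edges = cooccurrence_network(tokens, max_distance=3)
```

In this sketch each edge weight counts how many times a pair of words co-occurs within the chosen range, so repeated co-occurrences across a larger corpus produce heavier arcs.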

The SBS is a composite indicator with three dimensions: prevalence, diversity and connectivity{{Cite journal |last1=Bianchino |first1=Antonella |last2=Fusco |first2=Daniela |last3=Pisciottano |first3=Daniele |date=2021-05-27 |title=How to Measure the Touristic Competitiveness: A Mixed Mode Model Proposal |url=https://www.athensjournals.gr/tourism/2021-8-2-4-Bianchino.pdf |journal=Athens Journal of Tourism |volume=8 |issue=2 |pages=131–146 |doi=10.30958/ajt.8-2-4}}{{Cite book |last1=Beccari |first1=Nicholas |url=https://www.politesi.polimi.it/retrieve/a81cb05d-7655-616b-e053-1605fe0a889a/Thesis.pdf |title=Brand-generated and Usergenerated content videos on YouTube: characteristics, behavior and user perception |last2=Nicola |first2=Valerio |date=2019 |publisher=Politecnico di Milano |location=Milan, Italy}}{{cite book |last=Mercurio |first=Simona |date=2024 |editor-last1=Giordano |editor-first1=Giuseppe |editor-last2=Misuraca |editor-first2=Michelangelo |title=New Frontiers in Textual Data Analysis |publisher=Springer |pages=349–359 |chapter=What About Corruption? A Text Analytics Method for a Scoping Literature Review |series=Studies in Classification, Data Analysis, and Knowledge Organization |doi=10.1007/978-3-031-55917-4_28 |chapter-url=https://link.springer.com/chapter/10.1007/978-3-031-55917-4_28 |isbn=978-3-031-55916-7}}. SBS measures brand importance, a construct that cannot be understood by examining a single dimension alone.

= Prevalence =

Prevalence measures the frequency of brand name usage, indicating how often a brand is explicitly referenced in a corpus. The prevalence factor is associated with brand awareness, suggesting that a brand mentioned frequently in a text is more familiar to its authors. Likewise, frequent mentions of a brand name enhance its recognition and recall among readers.
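As a rough illustration, prevalence can be obtained by counting brand mentions across pre-processed documents. The sketch below assumes each document is already a list of tokens; the function name and the brand names are hypothetical.

```python
from collections import Counter

def prevalence(documents, brands):
    """Count how often each brand name appears across tokenized documents."""
    counts = Counter()
    for tokens in documents:
        for token in tokens:
            if token in brands:
                counts[token] += 1
    # Brands that are never mentioned get a prevalence of 0.
    return {brand: counts[brand] for brand in brands}

# Hypothetical, already pre-processed documents.
docs = [
    ["acme", "launch", "phone"],
    ["phone", "acme", "acme"],
    ["globex", "phone"],
]
prevalence_scores = prevalence(docs, ["acme", "globex", "initech"])
```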

= Diversity =

Diversity assesses the variety of words linked with a brand, focusing on textual associations, i.e., the words used alongside a particular brand or term. It can be measured with degree centrality, which is the number of connections the brand node has in the semantic network. Alternatively, an approach using distinctiveness centrality{{Cite journal |last1=Colladon |first1=Andrea Fronzetti |last2=Naldi |first2=Maurizio |date=2020-05-22 |title=Distinctiveness centrality in social networks |journal=PLOS ONE |language=en |volume=15 |issue=5 |pages=e0233276 |doi=10.1371/journal.pone.0233276 |doi-access=free |issn=1932-6203 |pmc=7244137 |pmid=32442196|arxiv=1912.03391 |bibcode=2020PLoSO..1533276F }} has been proposed, assigning greater significance to unique brand associations and reducing redundancy. The rationale is that distinctive textual associations enrich discussions about a brand, thereby enhancing its memorability.

With distinctiveness centrality, diversity can be calculated for the brand node in a semantic network, i.e., a weighted undirected graph G with n nodes and m arcs. If two nodes i and j are not connected, then $w_{ij}=0$; otherwise, the weight of the arc connecting them is $w_{ij} \ge 1$.

In the following, $g_j$ is the degree of node j and $I_{(f)}$ is the indicator function, which equals 1 if the condition f is true (here, if there is an arc connecting nodes i and j) and 0 otherwise.

$DI(i) = \sum_{j=1,j\neq i}^{n}\log_{10}\frac{n-1}{g_{j}}I_{(w_{ij}>0)}$
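The formula can be sketched in Python as follows, alongside plain degree centrality for comparison. The adjacency structure, the function names, and the toy network are assumptions made for illustration; only arcs with positive weight are stored, so the indicator function reduces to iterating over a node's neighbours.

```python
import math
from collections import defaultdict

def build_adjacency(edges):
    """Build {node: {neighbor: weight}} from {(u, v): weight} for an undirected graph."""
    adj = defaultdict(dict)
    for (u, v), weight in edges.items():
        adj[u][v] = weight
        adj[v][u] = weight
    return dict(adj)

def degree(adj, i):
    """Degree centrality as a raw count of neighbours."""
    return len(adj[i])

def distinctiveness(adj, i):
    """DI(i): sum of log10((n - 1) / g_j) over the neighbours j of node i."""
    n = len(adj)
    return sum(math.log10((n - 1) / len(adj[j])) for j in adj[i] if j != i)

# Toy network (hypothetical): "brand" is linked to "x" and "y"; "y" is also a hub.
toy_edges = {("brand", "x"): 1, ("brand", "y"): 2, ("x", "y"): 1, ("y", "z"): 1}
toy_adj = build_adjacency(toy_edges)
brand_degree = degree(toy_adj, "brand")
brand_di = distinctiveness(toy_adj, "brand")
```

In the toy network the link to "y" contributes nothing to the brand's score, because "y" is connected to every other node, whereas the link to the less connected "x" contributes log10(3/2).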

= Connectivity =

Connectivity evaluates a brand's connective power within broader discourse, indicating its capacity to serve as a bridge between various words/concepts (nodes) in the network. It captures a brand's brokerage power, its ability to connect different words, groups of words, or topics together. The calculation hinges on the weighted betweenness centrality metric.{{Cite journal |last1=Bashar |first1=Md Abul |last2=Nayak |first2=Richi |last3=Knapman |first3=Gareth |last4=Turnbull |first4=Paul |last5=Fforde |first5=Cressida |date=December 2023 |title=An Informed Neural Network for Discovering Historical Documentation Assisting the Repatriation of Indigenous Ancestral Human Remains |url=http://journals.sagepub.com/doi/10.1177/08944393231158788 |journal=Social Science Computer Review |language=en |volume=41 |issue=6 |pages=2293–2317 |doi=10.1177/08944393231158788 |arxiv=2303.14475 |issn=0894-4393}}
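A minimal sketch of weighted betweenness centrality, written with the standard library only, is given below. It follows the general scheme of Brandes' algorithm with Dijkstra shortest paths. Two choices are assumptions of this sketch rather than statements about the published method: co-occurrence weights are converted to distances as 1/weight, and scores are left unnormalized. Node labels are assumed to be mutually comparable (e.g., strings).

```python
import heapq
from collections import defaultdict

def to_distances(edges):
    """Build {node: {neighbor: distance}} from {(u, v): co-occurrence weight}.

    Assumption: distance = 1 / weight, so frequent co-occurrences are 'closer'.
    """
    adj = defaultdict(dict)
    for (u, v), weight in edges.items():
        adj[u][v] = adj[v][u] = 1.0 / weight
    return dict(adj)

def weighted_betweenness(adj):
    """Unnormalized betweenness of every node in an undirected weighted graph."""
    score = dict.fromkeys(adj, 0.0)
    for source in adj:
        dist = {source: 0.0}
        paths = defaultdict(float)   # number of shortest paths from source
        paths[source] = 1.0
        preds = defaultdict(list)    # predecessors on shortest paths
        done, order = set(), []
        heap = [(0.0, source)]
        while heap:
            d, v = heapq.heappop(heap)
            if v in done:
                continue             # stale heap entry
            done.add(v)
            order.append(v)
            for w, length in adj[v].items():
                candidate = d + length
                if w not in dist or candidate < dist[w] - 1e-12:
                    dist[w] = candidate
                    paths[w] = paths[v]
                    preds[w] = [v]
                    heapq.heappush(heap, (candidate, w))
                elif w not in done and abs(candidate - dist[w]) <= 1e-12:
                    paths[w] += paths[v]   # another equally short path
                    preds[w].append(v)
        # Accumulate dependencies from the farthest node back to the source.
        delta = dict.fromkeys(order, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += paths[v] / paths[w] * (1.0 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each unordered pair is visited from both endpoints in an undirected graph.
    return {node: value / 2.0 for node, value in score.items()}

# Hypothetical co-occurrence weights: "brand" co-occurs often with "x" and "y".
example_edges = {("brand", "x"): 4, ("brand", "y"): 4, ("x", "y"): 1}
connectivity = weighted_betweenness(to_distances(example_edges))
```

In this example the strongest path between "x" and "y" runs through "brand", so the brand node acts as the bridge and receives the only non-zero score.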

The Semantic Brand Score is given by the sum of the standardized values of prevalence, diversity, and connectivity. Standardization is typically performed by subtracting the mean from the raw scores of each dimension and then dividing by the standard deviation, where the mean and standard deviation are computed over the scores of all relevant words in the corpus.
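The aggregation step can be sketched as follows. The sketch assumes each dimension is available as a dictionary mapping every relevant word to its raw score, and it uses the population standard deviation; the source does not state whether the population or sample formula is intended, so that choice is an assumption, as are the function names and the example values.

```python
from statistics import mean, pstdev

def standardize(scores):
    """Convert raw scores {word: value} to z-scores across all words."""
    values = list(scores.values())
    mu, sd = mean(values), pstdev(values)
    # If every word has the same score, the dimension carries no information.
    return {word: (value - mu) / sd if sd > 0 else 0.0
            for word, value in scores.items()}

def semantic_brand_score(prevalence, diversity, connectivity):
    """Sum the standardized prevalence, diversity, and connectivity per word."""
    dimensions = [standardize(d) for d in (prevalence, diversity, connectivity)]
    return {word: sum(dim[word] for dim in dimensions) for word in prevalence}

# Hypothetical raw scores for three words.
sbs = semantic_brand_score(
    prevalence={"a": 1, "b": 2, "c": 3},
    diversity={"a": 1, "b": 2, "c": 3},
    connectivity={"a": 0, "b": 0, "c": 0},
)
```

Because each dimension is centred on its mean, a word with average prevalence, diversity, and connectivity receives a score of zero, and positive scores indicate above-average importance within the corpus.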

References

External links

Tutorial for the calculation of the Semantic Brand Score using Python: https://towardsdatascience.com/calculating-the-semantic-brand-score-with-python-3f94fb8372a6

{{Draft categories|

:Category:Graph algorithms

:Category:Graph theory

:Category:Network analysis

:Category:Text mining

:Category:Brand management

:Category:Network theory

:Category:Brand valuation

}}