User:1ctinus/V5 Report

This is the V5 People Report.

Here is the data on nationalities (and their equivalents) for all V5 people:

{{collapse top|CSV Dump}}

Nationality,All,Writers,Artists,Entertainers,Leaders

Abkhazia,1,,,,1

Afghanistan,22,,1,,18

Akkadian,2,,,,2

Albania,5,1,,,3

Algeria,17,4,4,,6

Angola,5,,1,,3

Antigua and Barbuda,3,,,,1

Arabia,72,20,1,,22

Arabic,8,1,,,1

Argentina,71,9,9,10,25

Armenia,13,1,1,,6

Assyrian,7,,,,7

Australia,244,41,16,29,32

Austria,107,12,31,6,7

Austro-Hungary,6,1,,1,

Azerbaijan,8,3,1,,3

Babylon,7,,,,5

Bahrain,2,,,,2

Barbados,7,,1,,3

Basque Country,4,,,,

Belarus,7,2,1,,1

Belgium,45,5,11,2,7

Belize,2,,,,2

Bengal/Bangladesh,26,1,6,1,11

Benin,3,,,,3

Bermuda,1,,,,1

Bhutan,5,,,,5

Black,18,2,4,2,2

Bolivia,10,,,,10

Bosnia and Herzegovina,6,,1,,1

Botswana,3,,,,3

Brazil,130,29,18,11,30

Brunei,4,,,,4

Bulgaria,19,5,2,,8

Burkina Faso,4,,,1,3

Burma,17,1,,1,12

Burundi,2,,,,2

Cambodia,16,,3,2,9

Cameroon,7,1,1,,2

Canada,196,15,19,46,23

Cape Verde,3,,1,,2

Carthage,6,,,,1

Central African Republic,4,,,,3

Chad,4,,,,4

Chile,30,3,2,4,19

China,391,48,39,19,110

Colombia,28,1,4,,16

Comoros,2,,,,2

Congo,20,3,4,,10

Costa Rica,8,,,,6

Côte d'Ivoire,6,,2,,2

Croatia,10,,2,,1

Cuba,29,2,7,2,11

Cyprus,4,,,,2

Czech Republic,38,7,7,,5

Czechoslovakia,3,,,,1

Denmark,56,7,7,10,9

Djibouti,2,,,,2

Dominica,4,1,,,2

Dominican Republic,8,,2,,3

Ecuador,15,,,,12

Egypt,114,6,9,9,70

El Salvador,9,,,,8

England,1014,253,142,159,58

Equatorial Guinea,2,,,,2

Eritrea,2,,,,1

Estonia,3,,1,,

Ethiopia,30,,1,,21

Faroe Islands,1,1,,,

Fiji,4,,,,3

Finland,32,11,7,2,3

Flanders,13,,8,,

France,898,150,152,112,82

Francia,15,,,,14

Gabon,3,,,,3

Gambia,4,1,,,3

Georgia,8,1,2,,3

Germany,602,43,77,31,78

Ghana,13,3,1,2,4

Greece,231,34,29,3,49

Grenada,4,,,,2

Guatemala,19,1,1,,14

Guinea,3,,,1,2

Guinea-Bissau,3,,,,2

Guyana,4,,,,4

Haiti,17,,1,,13

Hawaii,13,,1,1,9

Hebrew,3,,,,1

Hittite,4,,,,3

Honduras,7,,,,7

Hong Kong,29,2,7,9,4

Hungary,55,10,6,2,14

Iceland,9,2,2,1,3

India,450,30,45,141,74

Indonesia,39,1,10,8,14

Iraq,20,2,,,12

Ireland,106,25,10,11,13

Islamic,68,4,1,,2

Isle of Man,1,1,,,

Israel,29,2,5,,12

Italy,436,51,157,49,29

Jamaica,26,1,11,,6

Japan,360,55,88,52,48

Jewish,32,3,1,,1

Jordan,6,,,,5

Kazakhstan,5,,2,,2

Kenya,14,1,1,1,2

Kiribati,1,,,,1

Korea,134,13,34,13,41

Kosovo,1,,,,

Kurdistan,3,1,,1,

Kuwait,5,,,,4

Kyrgyzstan,3,1,,,2

Laos,8,1,,,5

Latvia,6,,,1,

Lebanon,9,,2,1,6

Lesotho,2,,,,2

Liberia,8,,,,5

Libya,2,,,,2

Liechtenstein,1,,,,1

Lithuania,10,,,,6

Luxembourg,12,,1,,11

Macau,1,,,,1

Macedonia,11,,1,,7

Madagascar,11,6,1,,4

Malawi,3,1,,,2

Malaysia,16,,3,3,6

Maldives,4,,,,4

Mali,9,,4,1,4

Malta,1,1,,,

Marshall Islands,1,,,,1

Martinique,2,1,,,

Mauritania,3,,,,3

Mauritius,3,,,,3

Mexico,94,6,14,23,28

Moldova,2,,1,,1

Monaco,8,,,,8

Mongolia,21,,,,14

Montenegro,1,,,,1

Morocco,13,2,,,7

Mozambique,4,,1,,2

Multiple (non-US),317,57,55,34,5

Multiple (US),426,45,67,80,6

N/A,13,,,1,6

Namibia,2,,,,2

Nauru,1,,,,1

Nepal,16,,1,2,10

Netherlands,132,7,26,6,18

New Zealand,44,6,3,7,10

Nicaragua,9,1,,,7

Niger,3,,,1,2

Nigeria,60,7,11,17,16

Niue,1,,1,,

Norway,46,8,8,1,8

Oman,5,,,,4

Ossetia,3,3,,,

Other Africa,26,1,1,1,9

Pakistan,38,2,10,1,13

Palau,1,,,,1

Palestine,6,1,1,,4

Panama,8,,1,,6

Papal States,52,0,,,25

Papua New Guinea,3,,,,2

Paraguay,11,1,,,9

Persia/Iran,135,18,15,9,47

Peru,27,2,1,,19

Philippines,59,3,10,21,16

Poland,97,17,14,4,33

Poland-Lithuania,2,,,,1

Portugal,60,4,4,2,15

Prussia,18,,1,,7

Puerto Rico,15,1,7,3,2

Qatar,4,,,,4

Romania,28,2,7,1,10

Rome,163,12,3,2,56

Rus',10,,,,10

Russia,357,84,52,27,46

Rwanda,4,,,,3

Saint Kitts and Nevis,1,,,,1

Saint Lucia,3,1,,,1

Saint Vincent and the Grenadines,1,,,,1

Samoa,4,1,1,,2

San Marino,1,,,,

São Tomé and Príncipe,1,,,,1

Saudi Arabia,15,1,,,10

Scandinavia,4,,,,1

Scotland,138,33,7,9,17

Scythia,1,,,,1

Senegal,8,1,1,2,3

Serbia,14,,2,1,4

Seychelles,2,,,,2

Sierra Leone,9,1,,,6

Singapore,12,2,2,1,5

Slovakia,7,,1,,3

Slovenia,11,7,1,,

Solomon Islands,2,,,,2

Somalia,7,1,,,2

Somaliland,1,,,,1

South Africa,62,8,8,5,15

South Sudan,1,,,,1

Soviet Union,149,18,16,33,20

Spain,219,23,41,16,50

Sri Lanka,16,,,4,9

Sudan,12,2,1,,9

Sumeria,4,1,,,3

Suriname,3,,,,2

Swaziland,2,,,,2

Sweden,93,13,12,14,10

Switzerland,68,4,9,5,

Syria,16,3,,,10

Tahiti,6,,,,5

Taiwan,17,2,5,3,4

Tajikistan,2,,,,2

Tanzania,12,3,3,1,4

Tatar,1,1,,,

Thailand,44,4,9,6,20

The Bahamas,2,,,,2

Tibet,24,,,,3

Timor-Leste,5,,,,3

Timurid,3,1,,,2

Togo,2,,,,2

Tonga,3,,,,3

Trinidad and Tobago,6,,2,,2

Tunisia,5,,,,3

Turkey,69,1,11,4,31

Turkmenistan,3,,,,3

Uganda,9,,,1,4

Ukraine,33,2,2,3,10

United Arab Emirates,4,,,,3

United Kingdom,478,75,38,48,79

United States,4119,567,630,982,138

Uruguay,13,2,,,7

Uyghur,1,,,,1

Uzbekistan,3,,1,,1

Vanuatu,2,,,,2

Venezuela,21,,1,,18

Venice,4,,,,1

Vietnam,28,1,4,,17

Wales,47,10,4,7,6

Yemen,7,,,,5

Yugoslavia,7,1,1,,3

Zambia,2,,,,1

Zimbabwe,8,1,2,,5

{{collapse bottom}}
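For anyone who wants to re-check the numbers, here is a rough sketch of how the dump could be loaded. It makes two assumptions that are not stated anywhere on this page: that a blank cell means zero, and that the unnamed first column holds the nationality label (called `Nationality` here). The `load_counts` and `share` helpers are made up for this illustration, and `SAMPLE` is just three rows copied from the dump.

```python
import csv
import io

# Three rows copied from the dump; the blank lines mimic its layout.
SAMPLE = """\
Australia,244,41,16,29,32

Brazil,130,29,18,11,30

Bhutan,5,,,,5
"""

# "Nationality" is an assumed label for the unnamed first column.
COLUMNS = ["Nationality", "All", "Writers", "Artists", "Entertainers", "Leaders"]


def load_counts(text):
    """Map each nationality to integer counts (blank cell -> 0, by assumption)."""
    counts = {}
    for row in csv.reader(io.StringIO(text)):
        if not row:  # csv.reader yields [] for the blank lines between records
            continue
        name, *cells = row
        counts[name] = {col: int(cell or 0) for col, cell in zip(COLUMNS[1:], cells)}
    return counts


def share(counts, name, column):
    """Fraction of one nationality's total entries that fall in one category."""
    return counts[name][column] / counts[name]["All"]


counts = load_counts(SAMPLE)
```

Note that the four category columns do not have to add up to `All`, so `share` is only meaningful relative to the `All` column.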

= Graphs =

= My methodology =

My methodology was not amazing, but I needed something to get the job done. I created a parsing algorithm that looks for demonyms in the first sentence of each article, which did the job for all but ~700 of the articles. I added those manually afterwards.
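The general idea can be sketched roughly as follows. This is not the actual script used for the report: the `DEMONYMS` table, the sentence splitter, and the `find_nationalities` function are all hypothetical stand-ins, and the real demonym list was much longer.

```python
import re

# Hypothetical demonym table; the list actually used for the report is not given here.
DEMONYMS = {
    "American": "United States",
    "English": "England",
    "British": "United Kingdom",
    "French": "France",
    "South African": "South Africa",
}

# Longest demonyms first so multi-word ones are tried before shorter ones.
_PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(d) for d in sorted(DEMONYMS, key=len, reverse=True))
    + r")\b"
)


def first_sentence(text):
    """Crude split at the first period followed by whitespace (abbreviations will fool it)."""
    return re.split(r"(?<=\.)\s", text.strip(), maxsplit=1)[0]


def find_nationalities(article_text):
    """Return the sorted labels whose demonyms appear in the first sentence.

    An empty list means nothing matched, so the article needs manual review.
    """
    matches = _PATTERN.findall(first_sentence(article_text))
    return sorted({DEMONYMS[m] for m in matches})
```

For example, `find_nationalities("Example Person was an English novelist. She was born in France.")` only looks at the first sentence, so the later mention of France is ignored. Articles that come back with an empty list are the kind that had to be handled by hand.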

Why are some ethnicities listed that aren't nationalities? Parsing reasons. For example, many people were listed as either English or British, even though one is an ethnicity and the other is a nationality. That's why they are separate in the data.

Judgement calls had to be made about which nationality to list for people from some of the historical countries. I am not perfect.

= Conclusions =

V5 desperately needs to increase coverage of these countries and regions:

  • Vietnam
  • Bangladesh
  • Indonesia
  • Latin America
  • Africa outside of Egypt, Nigeria and South Africa (especially Tanzania and Ethiopia)

The usual suspects, the wealthiest countries and the Anglosphere, are overrepresented in the data. Countries that may be overrepresented include:

  • US (especially in the Entertainers section, but that can be attributed to Hollywood)
  • Sweden
  • England/UK (especially in the Writers section, but that can be partly attributed to the dominance of English)
  • Switzerland
  • Tahiti for some reason?
  • Denmark
  • Australia (it should not have more people listed than Brazil)
  • Rome (its leaders may be slightly overrepresented)

Most importantly, the US and Europe alone account for over 70% of the names. I know this will be hard to fix since most editors are American (like me) or European, but it should always be kept in mind.