Wikipedia:People by year

{{subcat guideline|editing guideline|Categorization|WP:PBY}}

{{See also|Wikipedia:Overcategorization#Intersection by year or time period}}


Each biography is placed in one of the subcategories of Births by year and Deaths by year according to the date of birth and date of death in the article.

The same applies for images of non-anonymous people.

Sample

The article "Julius Schwartz":

:Julius "Julie" Schwartz (June 19, 1915 - February 8, 2004) was ...

includes:

{{DEFAULTSORT:Schwartz, Julius}}

[[Category:1915 births]]

[[Category:2004 deaths]]

placing the article in:

  • :Category:1915 births
  • :Category:2004 deaths

and sorting it by last name.
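The lines added to an article can be generated mechanically once the sort key and the years are known. The following is a minimal Python sketch of that step; the function name `year_category_wikitext` is hypothetical and is not part of any bot described on this page.

```python
def year_category_wikitext(sort_key, birth_year=None, death_year=None):
    """Build the wikitext lines that categorize a biography by year.

    sort_key is the value written into DEFAULTSORT; a year left as None
    produces no category line.
    """
    lines = ["{{DEFAULTSORT:" + sort_key + "}}"]
    if birth_year is not None:
        lines.append(f"[[Category:{birth_year} births]]")
    if death_year is not None:
        lines.append(f"[[Category:{death_year} deaths]]")
    return lines


# Reproduces the three lines of the Julius Schwartz sample.
print("\n".join(year_category_wikitext("Schwartz, Julius", 1915, 2004)))
```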

Use of the categories in Wikipedia

For a discussion about implementing the categories, see Wikipedia talk:People by year/Delete.

Some stats: /Reports/Stats

Which category to use

  1. Year of birth/death is known.
  2. Year of birth/death is approximate.
     • Use the categories by year (e.g., :Category:2005 births, :Category:2005 deaths).
  3. Year of birth is unknown.
     • Use the categories by century, e.g. :Category:20th-century births.
     • Use :Category:Year of birth missing.
  4. Year of death is unknown.
     • Use the categories by century, e.g. :Category:20th-century deaths.
     • If applicable, use :Category:Missing people.
     • Otherwise, use :Category:Year of death missing.
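The choice above can be sketched as a small decision function. This is an illustration in Python, not the logic of any existing bot. The function names are hypothetical, and the precedence (a by-year category first, then a by-century category when only the century is known, then the "missing" categories) is an assumption read from the order of the list. Living people are outside the scope of this sketch.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 20 -> '20th'."""
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


def birth_category(year=None, century=None):
    """Pick the birth category; year may be exact or approximate."""
    if year is not None:
        return f"Category:{year} births"
    if century is not None:  # assumption: century is used when it is all that is known
        return f"Category:{ordinal(century)}-century births"
    return "Category:Year of birth missing"


def death_category(year=None, century=None, missing_person=False):
    """Pick the death category, with the extra case for missing people."""
    if year is not None:
        return f"Category:{year} deaths"
    if century is not None:
        return f"Category:{ordinal(century)}-century deaths"
    if missing_person:
        return "Category:Missing people"
    return "Category:Year of death missing"


print(birth_category(year=2005))    # Category:2005 births
print(death_category(century=20))   # Category:20th-century deaths
```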

Templates for the descriptions of the category pages

The following templates can be used for the text on the category pages:

see also: Template:Ltm

Assignment of categories

The information to assign the categories can partially be extracted from Wikipedia and uploaded by bot. Some of the possibilities are:

=With lists in Wikipedia=

Lists providing the years:

=With categories=

Articles already categorized in :Category:People can be selected and checked for years.

The following articles subcategorized in :Category:People are not biographies:

  • Articles titled "List of .."
  • Articles in categories titled "Lists .."

1. To select categories:

  1. /SQL to find people categories (marginally reliable)
  2. /List of manually selected categories (used instead)
     • Input based on specific categories has been used, e.g. Peers, Continental Congressmen, various Olympics categories, etc.

2. To select biographies:

CREATE TABLE temp_people1

SELECT DISTINCT cur_id, cur_title, cur_text, cur_namespace, 0000 AS YOB, 0000 AS YOD

FROM temp_peoplecats, categorylinks, cur

WHERE ct_from_name=cl_to

AND cl_from=cur_id

3. To find the years mentioned in the articles:

DROP TABLE IF EXISTS temp_years;

CREATE TABLE temp_years SELECT cur_id AS y_id, cur_title+1-1 AS y_title, cur_namespace, cur_is_redirect FROM cur

WHERE cur_title RLIKE '^[0-9][0-9][0-9][0-9]$'

OR cur_title RLIKE '^[0-9][0-9][0-9]$'

OR cur_title RLIKE '^[0-9][0-9]$'

LIMIT 5000;

DELETE FROM temp_years WHERE cur_namespace <>0;

DELETE FROM temp_years WHERE cur_is_redirect <>0;

ALTER TABLE temp_years DROP cur_namespace, DROP cur_is_redirect;
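For readers who do not work with the database dump directly, the same filter can be sketched in Python. The tuple layout `(page_id, title, namespace, is_redirect)` is an assumption chosen to mirror the `cur` columns used above, and the function name `year_pages` is hypothetical. The sketch keeps only main-namespace, non-redirect pages whose title is two to four digits, and maps each page id to its year as a number.

```python
import re

# Two, three or four digits and nothing else, matching the three RLIKE patterns.
YEAR_TITLE = re.compile(r"[0-9]{2,4}")


def year_pages(pages):
    """Map page id -> year for pages that are year articles.

    pages is an iterable of (page_id, title, namespace, is_redirect).
    """
    years = {}
    for page_id, title, namespace, is_redirect in pages:
        if namespace != 0 or is_redirect:
            continue  # same effect as the two DELETE statements
        if YEAR_TITLE.fullmatch(title):
            years[page_id] = int(title)  # numeric, like cur_title+1-1
    return years
```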

4. Sample selection:

SELECT cur_title, cur_text, MIN(y_title) AS Y1, MAX(y_title) AS Y2, (MAX(y_title) - MIN(y_title)) AS Diff

FROM temp_people1, links, temp_years

WHERE cur_title LIKE 'James%'

AND temp_people1.cur_id=l_from

AND l_to=y_id

GROUP BY cur_title

:Sample output: Wikipedia:People by year/Reports/Year from article text.

:The result will need to be checked manually.
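The idea behind the selection is that the earliest and latest year articles linked from a biography are candidates for the years of birth and death. A minimal Python sketch of that calculation follows; the function name `year_span` and the input shapes are hypothetical. As with the query, the result is only a candidate, since an article may link to years unrelated to the person's life.

```python
def year_span(linked_page_ids, years):
    """Return (earliest, latest, difference) over the linked year pages.

    linked_page_ids: ids of the pages an article links to.
    years: mapping of page id -> year for the year articles.
    Returns None when the article links to no year article.
    """
    found = [years[page_id] for page_id in linked_page_ids if page_id in years]
    if not found:
        return None
    return min(found), max(found), max(found) - min(found)
```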

Checking sort keys

The bot creates a default sort key with the last part of the article title.

This default is not appropriate for:

As some names have been Westernized, it's not necessarily true for all.
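A minimal Python sketch of the default described above, assuming the key takes the form "Last, First" as in the Julius Schwartz sample. The function name `default_sort_key` is hypothetical; the bot's actual code is not shown on this page.

```python
def default_sort_key(title):
    """Move the last word of an article title to the front.

    'Julius_Schwartz' -> 'Schwartz, Julius'. A one-word title is
    returned unchanged.
    """
    name = title.replace("_", " ").strip()
    head, _, last = name.rpartition(" ")
    return f"{last}, {head}" if head else last


print(default_sort_key("Julius_Schwartz"))  # Schwartz, Julius
print(default_sort_key("Mao_Zedong"))       # Zedong, Mao
```

The second call is an illustrative example, not taken from this page, of a title for which the default fails: in "Mao Zedong" the family name already comes first, so the key needs to be corrected by hand.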

For the articles that have already been assigned sort keys, the one assigned by the bot can be compared with the ones added by other users.

The query may turn up more sort keys that need to be fixed in other categories than in Births/Deaths by year. Categories added through a template use the page title as the sort key, so the query excludes them. Subcategories of :Category:Families are likely to be sorted by first name and need to be ignored as well. Check also: :Category:People of the Vietnam War, :Category:Icelandic politicians.

SELECT CONCAT('', cur_title, ' ',

'', cl1.cl_to ,' ',

'', cl1.cl_sortkey,

'', cl2.cl_to, ' ',

'', cl2.cl_sortkey) AS CompTable

INTO OUTFILE 'wp_sortkeytest.txt'

FROM categorylinks AS cl1, categorylinks AS cl2, cur

WHERE (cl1.cl_to LIKE "%births" OR cl1.cl_to LIKE "%deaths")

AND cl1.cl_from = cl2.cl_from

AND cl1.cl_to <> cl2.cl_to

AND cl1.cl_sortkey <> cl2.cl_sortkey

AND cl1.cl_from = cur_id

# ignore categories added through templates

AND cl2.cl_to <> 'People_stubs'

AND cl2.cl_to <> 'Writer_stubs'

AND cl2.cl_to <> 'Language_stubs'

AND cl2.cl_to <> '1911_Britannica'

AND cl2.cl_to <> 'NPOV_disputes'

AND cl2.cl_to <> 'Unformatted_ice_hockey_player'

AND cl2.cl_to <> 'Substubs'

AND cl2.cl_to <> 'Articles_to_be_split'

AND cl2.cl_to <> 'Cleanup'

AND cl2.cl_to <> 'Pages_on_votes_for_deletion'

# ignore family cats, e.g.

AND cl2.cl_to <> 'The_Rockefellers'

AND cl2.cl_to <> 'The_Rothschilds'

# ignore categories with sortkey "*", e.g. for John Lennon in :Category:John Lennon

AND LEFT(cl2.cl_sortkey,1)<>'*'

# ignore differences beyond the first 4 char.

AND LEFT(cl1.cl_sortkey, 4)<>LEFT(cl2.cl_sortkey, 4)

ORDER BY cur_id

:Output: Wikipedia:People by year/Reports/Sortkeytest

In these cases, the sort keys need to be edited manually (for now).

See also: /Reports/Sortkeytest2 made with /Reports/Sortkeytest2/SQL
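The comparison rules used by the query (skip template-added categories, skip keys starting with "*", and compare only the first four characters) can be sketched for a single article in Python. The function name and the category "Comics_editors" in the usage example are hypothetical and serve only as an illustration.

```python
def conflicting_sort_keys(year_key, other_links, ignored_categories=frozenset()):
    """List the categories whose sort key disagrees with the year category.

    year_key: sort key used in the births/deaths category.
    other_links: iterable of (category, sort_key) for the same article.
    ignored_categories: categories added through templates, family
    categories, and similar cases excluded from the check.
    """
    conflicts = []
    for category, key in other_links:
        if category in ignored_categories:
            continue
        if key.startswith("*"):
            continue  # e.g. the main article of its own category
        if key[:4] != year_key[:4]:
            conflicts.append((category, key))
    return conflicts


links = [
    ("Comics_editors", "Julius Schwartz"),   # sorted by first name: a conflict
    ("People_stubs", "Julius Schwartz"),     # added through a template: ignored
]
print(conflicting_sort_keys("Schwartz, Julius", links, {"People_stubs"}))
```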

Bot problems

See also Wikipedia:bots for general precautions regarding the use of bots.

The birth/death year category uploaded by bot is not correct.

  • Please correct it.
  • If the bot note reads "Based on List of people by name", the year there is being used. It's likely that:
     • a disambiguating link/page is necessary, the years used being those of another person.
     • another page/list gives incorrect years. Please update them there as well.
  • If no source is given, the manual check of the input wasn't accurate.

Add the sample to this page if the type of problem hasn't been identified.

The sort key is not correct.

  • If there is no other category assigned to the article, please correct it. If the category of names isn't identified yet, please mention it above.
  • If the article already has another category, the sort key may eventually be fixed to match it.
  • In more recent additions, the sort key is based on existing categories (e.g. Peers, but obviously not People stubs). Thus other existing categories may need to be corrected as well.

There are multiple categories for birth/death.

These are generally due to conflicting sort keys or to conflicting sources for the years. Uploads are now checked against the last available database download so that categories are not added to articles that are already categorized, and past uploads have been corrected manually. Please remove any duplicated categories that still exist.

  • All these articles are identified and manually checked based on the last available database download:
     • /Reports/Multiple cats (built with: /Reports/Multiple cats/SQL)
  • Articles with different years are identified with a separate report, e.g. Hans Richter, [http://en.wikipedia.org/w/wiki.phtml?title=Sam_Jaffe&oldid=6180241 Sam Jaffe (former version)]. If such an article is likely to be split eventually, the categories are left on the page.

Sample queries

=People table=

Builds a table with article title, years, age, other categories, etc.

See: /SQL for table

=All=

Select all biography articles (in Births by year or Deaths by year):

SELECT DISTINCT CONCAT('#',cur_title,'') LIST

INTO OUTFILE 'wp_people_by_year_all.txt'

FROM cur, categorylinks

WHERE (cl_to LIKE '%deaths' OR cl_to LIKE '%births')

AND cl_from=cur_id

AND cl_sortkey NOT LIKE '*%'

ORDER BY cl_sortkey

LIMIT 10000

:Sample output: Wikipedia:People by year/Reports/All

=All with years=

With birth and death years:

SELECT CONCAT('*', REPLACE(cur_title,'_',' '), ' (', LEFT(cl1.cl_to,4), '-', LEFT(cl2.cl_to,4), ')') AS CompTable

INTO OUTFILE 'wp_name_(born-died).txt' #add directory/path

FROM categorylinks AS cl1, categorylinks AS cl2, cur

WHERE cl1.cl_to LIKE "%births"

AND cl1.cl_from = cl2.cl_from

AND cl2.cl_to LIKE "%deaths"

AND cl1.cl_from = cur.cur_id

ORDER BY cl1.cl_sortkey

:Sample output: Wikipedia:People by year/Reports/Name (born-died)

=With categories=

With birth and death years and other category assigned to the article:

..

=Oldest/youngest=

Oldest persons with biographies in Wikipedia:

SELECT CONCAT('*', REPLACE(cur_title,'_',' '), ' ', (cl2.cl_to - cl1.cl_to), ' (', LEFT(cl1.cl_to,4), '-', LEFT(cl2.cl_to,4), ')') AS CompTable

INTO OUTFILE 'wp_oldest_(born-died).txt' #add directory/path

FROM categorylinks AS cl1, categorylinks AS cl2, cur

WHERE cl1.cl_to LIKE "%births"

AND cl1.cl_from = cl2.cl_from

AND cl2.cl_to LIKE "%deaths"

AND cl1.cl_from = cur.cur_id

ORDER BY (cl2.cl_to - cl1.cl_to) DESC

LIMIT 10

:Sample output: Wikipedia:People by year/Reports/Oldest

:Similar: Wikipedia:People by year/Reports/Youngest
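The query relies on MySQL converting a category name such as `1915_births` to the number 1915 when it is used in a subtraction. The following Python sketch makes that step explicit; the function names and the input tuples are hypothetical. Unlike the query, it skips categories that do not consist of a plain year followed by "births" or "deaths" (for example by-century categories), which is a deliberate difference. The result is a difference of years, so it may be off by one from the actual age.

```python
import re

YEAR_CATEGORY = re.compile(r"(\d{1,4})_(births|deaths)")


def category_year(category):
    """Return the year of a category like '1915_births', else None."""
    match = YEAR_CATEGORY.fullmatch(category)
    return int(match.group(1)) if match else None


def oldest(people, limit=10):
    """Rank people by death year minus birth year, largest first.

    people: iterable of (title, birth_category, death_category).
    Returns a list of (age, name, birth_year, death_year).
    """
    rows = []
    for title, birth_cat, death_cat in people:
        born, died = category_year(birth_cat), category_year(death_cat)
        if born is None or died is None:
            continue
        rows.append((died - born, title.replace("_", " "), born, died))
    rows.sort(key=lambda row: row[0], reverse=True)
    return rows[:limit]
```

Sorting in ascending order instead gives the counterpart for the youngest.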

=Per decade=

Biographies available for people alive in a given decade:

..

Maintenance

=Disambiguation pages=

Disambiguation pages with year categories

SELECT DISTINCT CONCAT('#', REPLACE(p_title,'_',' '), '')

INTO OUTFILE 'wp_disambig_pages.txt'

FROM temp_peopleyr, categorylinks

WHERE p_id=cl_from

AND cl_to = 'Disambiguation'

:Output: /Reports/Disambig_pages

=Articles to be categorized=

Articles in a subcategory of :Category:People, but without a year of birth/death category:

..

=No other categories=

Articles in no subcategory of :Category:People other than birth/death:

SELECT DISTINCT CONCAT('*', REPLACE(p_title, '_', ' '), ' ',

IF(y2='0000', CONCAT('(born ', y1,')'), IF(y1='0000', CONCAT('(died ', y2,')'), CONCAT('(', y1, '-',y2,')'))),

IF(p_categories='', '', CONCAT(', ', p_categories))

)INTO OUTFILE 'wp_no_other_cat.txt'

FROM temp_peopleyr

WHERE p_cats='00'

ORDER BY p_sortkey

LIMIT 20000

:Output: /Reports/No_other_categories

=Multiple years=

Articles with multiple (and different) birth or death categories:

SELECT p_title, p_id, RIGHT(cl_to, 6), Count(*)

FROM temp_peopleyr, categorylinks

WHERE

p_id=cl_from

AND cl_to LIKE '%births' # or deaths instead

GROUP BY p_title

HAVING COUNT(*) > 1

LIMIT 100000

:Output: (as of October 3, 2004) Hans Richter

This does not identify articles that carry the same year of birth (or death) category twice.
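The grouping step can be sketched in Python as follows. The function name and the `(page_title, category)` input pairs are hypothetical. Because the categories of each page are collected in a set, a category listed twice counts once, so only pages with different birth (or death) categories are reported.

```python
from collections import defaultdict


def multiple_year_categories(links, suffix="births"):
    """Find pages with more than one distinct births (or deaths) category.

    links: iterable of (page_title, category) pairs.
    suffix: 'births' or 'deaths'.
    Returns a dict of page title -> sorted list of its categories.
    """
    by_page = defaultdict(set)
    for title, category in links:
        if category.endswith(suffix):
            by_page[title].add(category)
    return {
        title: sorted(categories)
        for title, categories in by_page.items()
        if len(categories) > 1
    }
```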

=Other=

  • Missing year of birth

{{Wikipedia categorization navbox}}
