User:Monkbot/Task 6: CS1 language support

Monkbot task 6 was created to modify CS1 citations whose {{para|title}} parameters contain non-Latin script so that they use the new CS1 parameter {{para|script-title}}.

__TOC__

A recent change to Module:Citation/CS1 (the engine underlying the {{cs1}} templates) created a new parameter, {{para|script-title}}. The new parameter is intended for citations whose title is written in a script that is not a Latin-based alphabet. Such scripts usually should not be italicized (Chinese, Japanese, etc.), may be written right-to-left (Hebrew, Persian, etc.), or both. {{para|script-title}} is supported by all citation templates that use Module:Citation/CS1 except {{tlx|cite encyclopedia}}. As of revision b, task 6 does not modify {{tld|cite encyclopedia}} templates.

The purpose of the {{tld|xx icon}} templates is to identify for readers that certain links are to sources that are not English language sources. Each of these {{tld|xx icon}} templates adds the page to the appropriate subcategory of {{cl|Articles with non-English-language external links}}. Prior to the 11 October 2014 update to Module:Citation/CS1, CS1 templates with {{para|language}} parameters also added pages to the individual subcategories in Category:Articles with non-English-language external links. Because CS1 citations do not always provide links to external sources, citations that used {{para|language}} to identify the language in which the source is written were improperly categorizing the article. Module:Citation/CS1 now uses {{cl|CS1 foreign language sources}}. Task 6 locates CS1 citation templates that are adjacent to {{tld|xx icon}} templates, adds a {{para|language}} parameter with the language code from the {{tld|xx icon}} template to the CS1 citation and then deletes the {{tld|xx icon}} template.
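
The conversion described above can be sketched in a few lines. This is a minimal Python illustration, not the bot's own C# rule: the template-name pattern `CS1_NAMES` and the function `icon_to_language` are hypothetical, only an icon that trails the citation is handled, and citations that contain nested templates are deliberately not matched.

```python
import re

# Hypothetical, much-reduced stand-in for the bot's list of CS1 template names.
CS1_NAMES = r"[Cc]ite (?:web|news|book|journal)"

# A citation without nested templates, immediately followed by {{xx icon}}.
ICON_AFTER = re.compile(
    r"(\{\{\s*" + CS1_NAMES + r"\b[^{}]*?)\s*\}\}\s*\{\{([a-z]{2}) icon\}\}"
)

def icon_to_language(wikitext: str) -> str:
    """Fold a trailing {{xx icon}} into the citation as |language=xx."""
    def fold(match):
        citation, code = match.group(1), match.group(2)
        if re.search(r"\|\s*language\s*=", citation):
            # The real task compares icon and parameter; this sketch
            # simply leaves such citations untouched.
            return match.group(0)
        return citation + " |language=" + code + "}}"
    return ICON_AFTER.sub(fold, wikitext)
```

For example, `{{cite web |title=Foo |url=http://example.com}} {{de icon}}` becomes `{{cite web |title=Foo |url=http://example.com |language=de}}`.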

Task 6 was initially created to work on pages listed in certain subcategories of Category:Articles with non-English-language external links. The criteria are: subcategories that contain 1,000 or more articles; or subcategories for languages that have an ISO639-1 two-character language code and that are listed at right-to-left. The first was an arbitrary cutoff; the second was not.

Task 6 begins by changing redirects to {{tld|xx icon}} templates into the standard form. For example, {{tlx|Da}}, {{tlx|Da li}}, {{tlx|Da-icon}}, and {{tlx|Dk icon}} are all redirects to {{tlx|da icon}} and so are changed to {{tlx|da icon}}. The purpose of the standardization is to simplify later rules in the script.
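
As a rough illustration of this standardization step, the following Python sketch applies the Danish redirect pattern and the generic lower-casing rule. The two regular expressions are taken from the script below; the function name `standardize_icons` and the use of Python instead of the bot's C# are assumptions made for the example.

```python
import re

# Danish redirects, as listed in the script's DANISH rule.
DANISH_REDIRECTS = re.compile(r"\{\{(?:[Dd]a|[Dd]a li|[Dd]a[ \-]icon|[Dd]k icon)\}\}")

# Any two-letter code followed by a space or hyphen and "icon".
GENERIC_ICON = re.compile(r"\{\{([a-zA-Z]{2})[\s-]icon\}\}")

def standardize_icons(wikitext: str) -> str:
    """Rewrite icon redirects to the canonical lower-case {{xx icon}} form."""
    wikitext = DANISH_REDIRECTS.sub("{{da icon}}", wikitext)
    return GENERIC_ICON.sub(
        lambda m: "{{" + m.group(1).lower() + " icon}}", wikitext
    )
```

After this pass, later rules only need to recognize the single spelling `{{xx icon}}`.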

After {{tld|xx icon}} standardization, task 6:

protects certain {{tld|xx icon}} templates from further edits;
moves {{tld|xx icon}} templates that are inside a CS1 citation template to a position ahead of the CS1 template for processing by later rules;
removes empty {{para|language}} parameters from CS1 citations so that the citation doesn't end up with duplicate {{para|language}} parameters at the end of the task;
removes wikilink markup from {{para|language}} parameter values so that Module:Citation/CS1 can properly categorize the citation;
~~removes {{para|language|English}}, {{para|language|British English}}, {{para|language|en}}, or {{para|language|en-GB}} from CS1 citations that use them.~~ discontinued at task 6n;
from task 6n: modifies {{para|language|English language}} and {{para|language|British English}} to {{para|language|English}}; modifies {{para|language|en-GB}} to {{para|language|en}}.
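
The parameter clean-up steps in this list can be approximated as follows. This is a simplified Python sketch, not the bot's C# code: `tidy_language` is a hypothetical helper that expects the text of a single citation, and its patterns are looser than the script's.

```python
import re

PARAM = r"(\|\s*language\s*=\s*)"   # captures "|language=" with its spacing
END = r"(?=\s*[|}])"                # value must end at the next "|" or "}"

def tidy_language(citation: str) -> str:
    """Apply the empty-parameter, wikilink and English-variant clean-ups."""
    # Remove an empty |language= so a second one is not added later.
    citation = re.sub(r"\|\s*language\s*=\s*(?=[|}])", "", citation)
    # [[Text]] or [[Link|Text]] becomes plain Text.
    citation = re.sub(PARAM + r"\[\[(?:[^\]|]*\|)?([^\]|]+)\]\]", r"\1\2", citation)
    # Task 6n behaviour: normalize English variants instead of removing them.
    citation = re.sub(
        PARAM + r"(?:[Ee]nglish [Ll]anguage|[Bb]ritish [Ee]nglish)" + END,
        r"\1English", citation)
    citation = re.sub(PARAM + r"en-GB" + END, r"\1en", citation)
    return citation
```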

Some citations have {{para|language}} parameters that contain RFC1766-style language codes (code-subcode, where code is an ISO639-1 language code and subcode is an ISO3166 country code). CS1 did not originally support this style of language parameter, so task 6 truncated these codes to just the ISO639-1 portion; the truncation rules were disabled at revision p because cs1|2 now accepts IETF-like language tags. Chinese is written in both simplified and traditional forms. Where {{para|language|simplified Chinese}} or {{para|language|traditional Chinese}} parameters occur, task 6 removes the qualifier. Where {{para|language}} contains a language name followed by the word language ({{para|language|German language}}), task 6 removes the word language.
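
A small Python sketch of these three value simplifications follows. The helper name `simplify_language_value` is hypothetical and the patterns are simplified from the C# rules in the script. The subcode truncation is shown as this paragraph describes it, although the script's revision notes indicate that rule was disabled in revision p.

```python
import re

PARAM = r"(\|\s*language\s*=\s*)"   # captures "|language=" with its spacing
END = r"(?=\s*[|}])"                # value must end at the next "|" or "}"

def simplify_language_value(citation: str) -> str:
    """Reduce qualified |language= values to a bare code or name."""
    # xx-YY (code-subcode) becomes xx.
    citation = re.sub(PARAM + r"([a-zA-Z]{2})-[a-zA-Z]+" + END, r"\1\2", citation)
    # "simplified Chinese" / "traditional Chinese" becomes "Chinese".
    citation = re.sub(
        PARAM + r"(?:[Ss]implified|[Tt]raditional)\s+(Chinese)" + END,
        r"\1\2", citation)
    # "<name> language" becomes "<name>".
    citation = re.sub(PARAM + r"([A-Za-z]+)\s+[Ll]anguage" + END, r"\1\2", citation)
    return citation
```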

In a CS1 citation, {{para|language}} may either precede or follow {{para|title}}, with or without intervening parameters. Evaluating each citation properly would then require a rule for each case. Alternatively, multiple rules are not needed if each citation is first modified to a standard format. Editors generally place {{para|language}} somewhere after {{para|title}}, so task 6 modifies those citation templates where {{para|language}} precedes {{para|title}} by moving {{para|language}} to the end of the citation (the same place it puts {{para|language}} parameters that are created from {{tld|xx icon}} templates).
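
The reordering can be illustrated with a short Python sketch. It is an assumption-laden simplification: `move_language_last` is a hypothetical helper, it expects a single flat citation (no nested templates and no piped wikilinks), and it splits on `|` where the bot uses regular expressions.

```python
def move_language_last(citation: str) -> str:
    """Move |language= to the end when it precedes |title=."""
    body = citation[2:-2]                      # strip the outer {{ and }}
    parts = body.split("|")
    names = [part.split("=", 1)[0].strip() for part in parts]
    if "language" not in names or "title" not in names:
        return citation
    lang_at, title_at = names.index("language"), names.index("title")
    if lang_at > title_at:
        return citation                        # already in the standard order
    language = parts.pop(lang_at).strip()
    parts[-1] = parts[-1].rstrip() + " "       # keep one space before the new "|"
    parts.append(language)
    return "{{" + "|".join(parts) + "}}"
```

With every citation in the same order, a later rule only has to look for {{para|title}} followed by {{para|language}}.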

Certain citations shouldn't be edited. Task 6 employs a multilevel protection scheme. Edits to protected elements are prevented by the insertion of a special text string that makes the template unrecognizable to subsequent rules. Elements that include either of the special text strings __PROTECTED__ or __PROTECTED2__ are never edited by task 6, except to remove the protection string at the task's completion. Reasons for this level of protection are:

a citation with leading or trailing {{tld|xx icon}} templates contains {{para|language|}} where the {{tld|xx icon}} code (xx) or the code's equivalent language name does not match the language name or code in {{para|language}}; where there is a match, {{tld|xx icon}} is removed;
the citation includes another template, especially templates like {{tlx|nihongo}}, which can confuse the later rules;
in groups of two or more {{tld|xx icon}} or {{tld|xxx icon}} templates, the first and last are protected to prevent later rules from taking one of them as a value for a citation's {{para|language}} parameter;
{{tlx|en icon}} when amongst other {{tld|xx icon}} or {{tld|xxx icon}} templates; it is presumed that such use indicates a multilingual source.
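
The marker technique itself is simple, and a minimal Python sketch may make it concrete. The functions `protect_groups` and `unprotect` are hypothetical, and the group pattern is a reduced version of the script's ICON GROUPS rules that only handles icons separated by whitespace.

```python
import re

MARK = "__PROTECTED__"

# What a later rule would look for: a plain {{xx icon}} or {{xxx icon}}.
ICON = re.compile(r"\{\{([a-z]{2,3}) icon\}\}")

# Two or more icons separated only by whitespace.
ICON_GROUP = re.compile(r"\{\{[a-z]{2,3} icon\}\}(?:\s*\{\{[a-z]{2,3} icon\}\})+")

def protect_groups(text: str) -> str:
    """Mark the first and last icon of each group so ICON no longer matches them."""
    def mark(match):
        group = match.group(0)
        return "{{" + MARK + group[2:-2] + MARK + "}}"
    return ICON_GROUP.sub(mark, text)

def unprotect(text: str) -> str:
    """Final step: strip every protection marker."""
    return text.replace(MARK, "")
```

Between the two calls, a rule that searches with `ICON` cannot see the marked end icons of a group, which is what keeps it from taking one of them as a {{para|language}} value.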

The second level of protection is applied only after the first level protection rules have been applied. This level identifies CS1 citations that have {{para|title}} values containing one or more Latin characters. The script is not smart enough to know if these characters are part of the original writing system, are a transliteration, or are a translation. Under certain circumstances described later, task 6 may edit those citations marked with __PROTECTED1__.

Unprotected {{tlx|en icon}} templates are then deleted.

For each of the rtl languages, the CJK languages, and other non-Latin scripts (Greek, Hebrew, Cyrillic), and in keeping with MOS:Foreign terms, special rules require that the content of {{para|title}} match the language identified in {{tld|xx icon}} or {{para|language}}. For example, the rule for Arabic requires an {{tlx|ar icon}} or {{para|language|ar}} or {{para|language|Arabic}} and that {{para|title}} contain only punctuation, digits (0–9), and Arabic script. When these conditions are met, task 6 replaces {{para|title|...}} with {{para|script-title|ar:...}}, adds {{para|language|ar}} (if appropriate) and deletes the adjacent {{tld|ar icon}} template (if present).
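
The Arabic rule can be sketched as below. This is a Python approximation under stated assumptions: the helpers `is_arabic_title` and `to_script_title` are hypothetical, the Arabic test uses the Unicode block U+0600 to U+06FF in place of the script's `\p{IsArabic}`, whitespace between words is allowed, and only the {{para|language}} form of the precondition is checked (icon handling is left out).

```python
import re
import unicodedata

def is_arabic_title(title: str) -> bool:
    """True if title has Arabic-block characters and otherwise only
    whitespace, digits 0-9 and punctuation."""
    has_arabic = False
    for ch in title:
        if "\u0600" <= ch <= "\u06ff":
            has_arabic = True
        elif ch.isspace() or ch in "0123456789":
            continue
        elif unicodedata.category(ch).startswith("P"):
            continue
        else:
            return False
    return has_arabic

LANGUAGE_IS_ARABIC = re.compile(r"\|\s*language\s*=\s*(?:ar|[Aa]rabic)\s*[|}]")
TITLE = re.compile(r"\|(\s*)title(\s*=\s*)([^|{}]*?)(\s*)(?=[|}])")

def to_script_title(citation: str) -> str:
    """Replace |title=... with |script-title=ar:... when the rule's conditions hold."""
    if not LANGUAGE_IS_ARABIC.search(citation):
        return citation
    def swap(match):
        lead, equals, title, tail = match.groups()
        if not is_arabic_title(title):
            return match.group(0)          # Latin or mixed title: leave alone
        return "|" + lead + "script-title" + equals + "ar:" + title + tail
    return TITLE.sub(swap, citation, count=1)
```

A title such as `Al-Akhbar` fails the script test and is left in {{para|title}}, which corresponds to the Latin-title case that the second protection level exists to catch.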

Languages for which task 6 supports {{para|script-title}} are:

{{columns-list |colwidth=15em|

Arabic (ar)
Armenian (hy)
Bosnian (bs)
Chinese (zh)
Greek (el)
Hebrew (he)
Japanese (ja)
Korean (ko)
Kurdish (ku)
Maldivian (dv){{dagger}}
Pashto (ps)
Persian (fa)
Russian (ru)
Serbian (sr)
Sindhi (sd)
Thai (th)
Ukrainian (uk)
Uyghur (ug)
Yiddish (yi)

}}

{{dagger}} {{small|when {{para|language|divehi}}, {{para|language|dhivehi}}, {{para|language|maldivian}}, {{para|language|dv}}; when citation has adjacent {{tlx|dv icon}}, {{para|language}} parameter must be {{para|language|Maldivian}} or {{para|language|dv}};}}

For those languages that use Latin or Latin-variant alphabets, task 6 simply adds {{para|language|xx}} and deletes the adjacent {{tld|xx icon}} template.

For those CS1 citations that have Latin characters in {{para|title}}, and so now contain __PROTECTED1__, task 6 deletes the adjacent icon and adds {{para|language|xx}} to the citation.

As a final step, wherever task 6 added __PROTECTED__, __PROTECTED1__, and __PROTECTED2__, that text is removed.

From 18 April 2015, Module:Citation/CS1 supports a comma-delimited list of language names. From Rev. o, task 6 will locate cs1|2 templates followed by two to five {{tld|xx icon}} templates and add the codes from those templates to a {{para|language}} parameter.
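
A hedged Python sketch of the multi-icon rule follows. The function `icons_to_language_list` is hypothetical, the citation pattern is a simplification that rejects nested templates, and the `", "` separator is an assumption about how the list is written.

```python
import re

ICON = r"\{\{[a-z]{2} icon\}\}"

# A flat citation followed by two to five icons, and not by a sixth.
MULTI_ICON = re.compile(
    r"(\{\{\s*[Cc]ite \w+[^{}]*?)\s*\}\}((?:\s*" + ICON + r"){2,5})(?!\s*" + ICON + r")"
)

def icons_to_language_list(text: str) -> str:
    """Fold two to five trailing icons into one |language= list."""
    def fold(match):
        citation = match.group(1)
        if re.search(r"\|\s*language\s*=", citation):
            return match.group(0)          # sketch: skip citations that already have it
        codes = re.findall(r"\{\{([a-z]{2}) icon\}\}", match.group(2))
        return citation + " |language=" + ", ".join(codes) + "}}"
    return MULTI_ICON.sub(fold, text)
```

For example, `{{cite web |title=T |url=u}} {{de icon}} {{fr icon}}` becomes `{{cite web |title=T |url=u |language=de, fr}}`, while a citation followed by a single icon or by more than five is left for other rules.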

Hidden under the hood at Module:Citation/CS1 is the process that takes {{para|title|transcription}}, {{para|script-title|xx:original writing system title}}, and {{para|trans-title|translated title}} and puts them all together with {{tag|bdi|params=lang="xx"}} which both isolates the content for rtl languages and helps the browser to correctly display the script.

If, at the end of all of this, only casing has been changed ({{tld|XX icon}} to {{tld|xx icon}}) then the change is not saved.

Article pages that contain {{tlx|bots|Monkbot 6}} or that do not contain Module:Citation/CS1-supported templates will not be edited by this task.

Ancillary tasks

This script also:

To do list

Script

// REVISIONS:

// 2014-11-13: Rev a:

// Detect and remove |language=British English

// |language=divehi or |language=dhivehi or |language=maldivian or |language=dv

// 2014-11-14: Rev b:

// remove support for cite encyclopedia; parameter remapping in Module:Citation/CS1 doesn't work because no |script-chapter

// 2014-11-14: Rev c:

// add support for Armenian (hy);

// 2014-11-15: Rev d:

// Mandarin and Cantonese dialects to Chinese; standard Chinese to Chinese;

// 2014-11-16: Rev e:

// Revise protection rule so CS1 templates with embedded templates are more correctly ignored;

// 2014-11-17: Rev f:

// Modify |language=Nynorsk to |language=Norwegian Nynorsk;

// 2014-11-17: Rev g:

// Add rule to remove empty |script-title= already in a citation;

// 2014-11-18: Rev h:

// Modify |language=Bokmål to |language=Norwegian Bokmål;

// 2014-11-18: Rev i:

// Modify |language=Português to |language=Portuguese;

// 2014-11-18: Rev j:

// Remove |language=English language;

// 2014-11-18: Rev k:

// Add rule to search previously edited pages for erroneous edits that may have placed |language=xx at the end of an embedded template; Use Category:CS1 uses foreign language script;

// 2015-04-26: Rev l:

// expand the number of rules that can use IS_CS1E; add cite arxiv, cite map, cite episode, cite serial;

// 2015-04-27: Rev m:

// remove support for cite episode; parameter remapping in Module:Citation/CS1 doesn't work because no |script-chapter

// 2015-08-26: Rev n:

// change variants of |language=english because the module now simply hides english annotation;

// 2015-08-28: Rev o:

// add multi-icon to language parameter; enable newsgroup and newspaper;

// 2019-06-10: Rev p:

// allow IETF-like language tags because cs1|2 accepts them

public string ProcessArticle(string ArticleText, string ArticleTitle, int wikiNamespace, out string Summary, out bool Skip)

{

Skip = true;

// Summary = "add |script-title=; replace {{xx icon}} with |language= in CS1 citations; clean up language icons;";

Summary = "Task 6p: add |script-title=; replace {{xx icon}} with |language= in CS1 citations; normalize language icons;";

string pattern; // local variable to hold regex pattern for reuse

string IS_CJK = @"\p{IsHangulSyllables}\p{IsCJKUnifiedIdeographs}\p{IsHalfwidthandFullwidthForms}\p{IsCJKSymbolsandPunctuation}\p{IsHiragana}\p{IsKatakana}";

string IS_DIGITS_AND_SYMBOLS = @"\d\p{P}~\$\^\+`\=\|\<\>";

string IS_ARABIC_SCRIPT = @"[" + IS_DIGITS_AND_SYMBOLS + @"]*[\p{IsArabic}]+"; // Arabic, Pashto, Uyghur

string IS_ARMENIAN_SCRIPT = @"[" + IS_DIGITS_AND_SYMBOLS + @"]*[\p{IsArmenian}]+"; // Armenian

string IS_CJK_SCRIPT = @"[" + IS_DIGITS_AND_SYMBOLS + @"]*[" + IS_CJK + @"]+"; // Chinese, Japanese, Korean

string IS_CYRILLIC_SCRIPT = @"[" + IS_DIGITS_AND_SYMBOLS + @"]*[\p{IsCyrillic}\p{IsCyrillicSupplement}]+"; // Bosnian, Russian, Serbian, Ukrainian

string IS_GREEK_SCRIPT = @"[" + IS_DIGITS_AND_SYMBOLS + @"]*[\p{IsGreek}]+"; // Greek

string IS_HEBREW_SCRIPT = @"[" + IS_DIGITS_AND_SYMBOLS + @"]*[\p{IsHebrew}]+"; // Hebrew, Yiddish

string IS_PERSIAN_SCRIPT = @"[" + IS_DIGITS_AND_SYMBOLS + @"]*[\p{IsArabic}\p{IsHebrew}\p{IsCyrillic}\p{IsCyrillicSupplement}]+"; // Persian

string IS_SINDHI_SCRIPT = @"[" + IS_DIGITS_AND_SYMBOLS + @"]*[\p{IsArabic}\p{IsDevanagari}]+"; // Sindhi

string IS_THAANA_SCRIPT = @"[" + IS_DIGITS_AND_SYMBOLS + @"]*[\p{IsThaana}]+"; // Maldivian

string IS_THAI_SCRIPT = @"[" + IS_DIGITS_AND_SYMBOLS + @"]*[\p{IsThai}]+"; // Thai

Dictionary<string, string> language_map = new Dictionary<string, string>();

language_map.Add("ar", "arabic"); // Arabic

language_map.Add("bs", "bosnian"); // Cyrillic

language_map.Add("ca", "catalan");

language_map.Add("cs", "czech");

language_map.Add("da", "danish");

language_map.Add("de", "german");

language_map.Add("dv", "maldivian"); // TODO: do special case for this? mediawiki doesn't recognize maldivian nor dhivehi but does recognize divehi

language_map.Add("el", "greek"); // Greek

language_map.Add("es", "spanish");

language_map.Add("fa", "persian"); // Arabic, Cyrillic, Hebrew

language_map.Add("fi", "finnish");

language_map.Add("fr", "french");

language_map.Add("he", "hebrew");

language_map.Add("hr", "croatian");

language_map.Add("hu", "hungarian");

language_map.Add("hy", "armenian");

language_map.Add("id", "indonesian");

language_map.Add("it", "italian");

language_map.Add("ja", "japanese");

language_map.Add("ko", "korean");

language_map.Add("ku", "kurdish");

language_map.Add("lt", "lithuanian");

language_map.Add("nl", "dutch");

language_map.Add("no", "norwegian");

language_map.Add("pl", "polish");

language_map.Add("ps", "pashto"); // Arabic*

language_map.Add("pt", "portuguese");

language_map.Add("ro", "romanian");

language_map.Add("ru", "russian"); // Cyrillic*

language_map.Add("sd", "sindhi"); // Arabic, Devanagari

language_map.Add("sk", "slovak");

language_map.Add("sl", "slovenian");

language_map.Add("sr", "serbian"); // Cyrillic

language_map.Add("sv", "swedish");

language_map.Add("th", "thai");

language_map.Add("tr", "turkish");

language_map.Add("ug", "uyghur"); // Arabic

language_map.Add("uk", "ukrainian"); // Cyrillic

language_map.Add("yi", "yiddish"); // Hebrew

language_map.Add("zh", "chinese");

Dictionary<string, string> spelling_map = new Dictionary<string, string>();

spelling_map.Add("Belorussian", "Belarusian");

spelling_map.Add("Castilan", "Spanish");

spelling_map.Add("Germaan", "German");

spelling_map.Add("Norwegain", "Norwegian");

spelling_map.Add("Portuguese (Brazil)", "Portuguese");

//---------------------------< R E P L A C E R E D I R E C T S >--------------------------------------------

// ARABIC: Replace {{AR}}, {{AR icon}} with {{ar icon}}.

ArticleText = Regex.Replace (ArticleText, @"\{\{(?:AR|(?:AR|[Aa]r) icon)\}\}", "{{ar icon}}");

// CATALAN: Replace {{Ca}}, {{Ca li}} with {{ca icon}}.

ArticleText = Regex.Replace (ArticleText, @"\{\{(?:[Cc]a|[Cc]a li|Ca icon)\}\}", "{{ca icon}}");

// CHINESE: Replace {{cn icon}}, {{zh-icon}} with {{zh icon}}.

ArticleText = Regex.Replace (ArticleText, @"\{\{(?:[Cc]n icon|[Zz]h[ \-]icon)\}\}", "{{zh icon}}");

// CROATIAN: Replace {{Hr li}} with {{hr icon}}.

ArticleText = Regex.Replace (ArticleText, @"\{\{(?:[Hh]r li|Hr icon)\}\}", "{{hr icon}}");

// CZECH: Replace {{Cs li}}, {{Cz}}, {{Cz icon}} with {{cs icon}}.

ArticleText = Regex.Replace (ArticleText, @"\{\{(?:[Cc]s li|[Cc]z|[Cc]z icon|Cs icon)\}\}", "{{cs icon}}");

// DANISH: Replace {{Da}}, {{Da li}}, {{Da-icon}}, {{Dk icon}} with {{da icon}}.

ArticleText = Regex.Replace (ArticleText, @"\{\{(?:[Dd]a|[Dd]a li|[Dd]a[ \-]icon|[Dd]k icon)\}\}", "{{da icon}}");

// ENGLISH: Replace {{En li}}, {{En-icon}}, {{Ref-en}}

ArticleText = Regex.Replace (ArticleText, @"\{\{(?:[Ee]n icon|[Ee]n li|[Ee]n\-icon|[Rr]ef-en)\}\}", "{{en icon}}");

// FINNISH: Replace {{Fi}}, {{Fi li}} with {{fi icon}}.

ArticleText = Regex.Replace (ArticleText, @"\{\{(?:[Ff]i|[Ff]i li|Fi icon)\}\}", "{{fi icon}}");

// FRENCH: Replace {{Fr icon}}, {{Fr}}, {{fr}}, {{French icon}}, {{FR-icon}}, {{Fr li}}, {{Fr-icon}}, {{Ref-fr}} with {{fr icon}}. {{FR}} is a redirect to {{FRA}}, a flag template

ArticleText = Regex.Replace (ArticleText, @"\{\{(?:[Ff]r icon|[Ff]r|[Ff]rench icon|FR-icon|[Ff]r li|[Rr]ef-fr)\}\}", "{{fr icon}}");

// GERMAN: Replace {{De li}}, {{De-icon}}, {{Ger}}, {{ger}}, {{Icon de}}, {{Ref-de}} with {{de icon}}. {{GER}} is a redirect to {{DEU}}, a flag template

ArticleText = Regex.Replace (ArticleText, @"\{\{(?:[Dd]e li|[Dd]e[ \-]icon|[Gg]er|[Ii]con de|[Rr]ef\-de)\}\}", "{{de icon}}");

// GREEK: Replace {{El}}, {{el}}, {{El icon}}, {{Gr icon}}, {{Gre icon}} with {{el icon}}. {{EL}} is a redirect to {{External links}}, a maintenance template

ArticleText = Regex.Replace (ArticleText, @"\{\{(?:[Ee]l|[Ee]l icon|[Gg]r icon|[Gg]re icon)\}\}", "{{el icon}}");

// HUNGARIAN: Replace {{Hu}}, {{Hu li}}, {{Ref-hu}} with {{hu icon}}.

ArticleText = Regex.Replace (ArticleText, @"\{\{(?:[Hh]u|[Hh]u li|[Rr]ef\-hu|Hu icon)\}\}", "{{hu icon}}");

// INDONESIAN: Replace {{Id}}, {{Id li}}, {{Indonesian}}, {{Indonesian icon}} with {{id icon}}.

ArticleText = Regex.Replace (ArticleText, @"\{\{(?:[Ii]d|[Ii]d li|[Ii]ndonesian|[Ii]ndonesian icon|Id icon)\}\}", "{{id icon}}");

// ITALIAN: Replace {{It li}}, {{It}} with {{it icon}}.

ArticleText = Regex.Replace (ArticleText, @"\{\{(?:[Ii]t li|[Ii]t|It icon)\}\}", "{{it icon}}");

// JAPANESE: Replace {{Jp-icon}}, {{Ja}}, {{Ja li}}, {{Ja-icon}}, {{Jp icon}}, {{Jp language}} with {{ja icon}}.

ArticleText = Regex.Replace (ArticleText, @"\{\{(?:[Jj]a icon|[Jj]p\-icon|[Jj]a|[Jj]a li|[Jj]a\-icon|[Jj]p icon|[Jj]p language)\}\}", "{{ja icon}}");

// KOREAN: Replace {{Ko}} with {{ko icon}}. {{KO}} is used for something else

ArticleText = Regex.Replace (ArticleText, @"\{\{[Kk]o(?: icon)?\}\}", "{{ko icon}}");

// LITHUANIAN: Replace {{Lt li}}, {{Lticon}} with {{lt icon}}.

ArticleText = Regex.Replace (ArticleText, @"\{\{(?:[Ll]t li|[Ll]ticon|Lt icon)\}\}", "{{lt icon}}");

// DUTCH (NETHERLANDS): Replace {{Du icon}}, {{Nl}}, {{Nl li}}, {{Nl-icon}} with {{nl icon}}. {{NL}} is used as a flag template

ArticleText = Regex.Replace (ArticleText, @"\{\{(?:[Dd]u icon|[Nn]l|[Nn]l li|[Nn]l[ \-]icon)\}\}", "{{nl icon}}");

// NORWEGIAN: Replace {{No-icon}} with {{no icon}}.

ArticleText = Regex.Replace (ArticleText, @"\{\{[Nn]o[ \-]icon\}\}", "{{no icon}}");

// PERSIAN: Replace {{fa}} and {{pr icon}} with {{fa icon}}.

ArticleText = Regex.Replace (ArticleText, @"\{\{(?:[Ff]a|[Pp]r icon|Fa icon)\}\}", "{{fa icon}}");

// POLISH: Replace {{Pl}}, {{pl}}, {{Pl li}}, {{Pl-icon}} with {{pl icon}}. {{PL}} is a redirect to Plainlist

ArticleText = Regex.Replace (ArticleText, @"\{\{(?:[Pp]l|[Pp]l li|[Pp]l[ \-]icon)\}\}", "{{pl icon}}");

// PORTUGUESE: Replace {{Pt}}, {{Pt li}} with {{pt icon}}.

ArticleText = Regex.Replace (ArticleText, @"\{\{(?:[Pp]t|[Pp]t li|Pt icon)\}\}", "{{pt icon}}");

// ROMANIAN: Replace {{Ref-ro}}, {{Ro}}, {{Ro li}}, {{Ro-icon}} with {{ro icon}}.

ArticleText = Regex.Replace (ArticleText, @"\{\{(?:[Rr]ef-ro|[Rr]o|[Rr]o li|[Rr]o[ \-]icon)\}\}", "{{ro icon}}");

// RUSSIAN: Replace {{Ru li}}, {{Icon ru}}, {{Ref-ru}}, {{Ru Icon}}, {{Ru language}}, {{Ru-icon}} with {{ru icon}}.

ArticleText = Regex.Replace (ArticleText, @"\{\{(?:[Rr]u li|[Ii]con ru|[Rr]ef-ru|[Rr]u Icon|[Rr]u language|[Rr]u-icon)\}\}", "{{ru icon}}");

// SERBIAN: Replace {{SR icon}}, {{Sr li}} with {{sr icon}}.

ArticleText = Regex.Replace (ArticleText, @"\{\{(?:(?:[Ss]r|SR) icon|[Ss]r li)\}\}", "{{sr icon}}");

// SINDHI: Replace {{Sd}} with {{sd icon}}. {{SD}} is a speedy delete template

ArticleText = Regex.Replace (ArticleText, @"\{\{(?:[Ss]d|Sd icon)\}\}", "{{sd icon}}");

// SLOVAK: Replace {{Sk}} with {{sk icon}}. {{SK}} is a flag template

ArticleText = Regex.Replace (ArticleText, @"\{\{(?:[Ss]k|Sk icon)\}\}", "{{sk icon}}");

// SLOVENIAN: Replace {{Sl}}, {{sl}}, {{Sl li}}, {{Slovene}} with {{sl icon}}. {{SL}} is a redirect to Subscription or libraries template

ArticleText = Regex.Replace (ArticleText, @"\{\{(?:[Ss]l|[Ss]l li|[Ss]lovene|Sl icon)\}\}", "{{sl icon}}");

// SPANISH: Replace {{Es-icon}}, {{Sp icon}}, {{Es}}, {{Es li}} with {{es icon}}.

ArticleText = Regex.Replace (ArticleText, @"\{\{(?:[Ee]s[ \-]icon|[Ss]p icon|[Ee]s|[Ee]s li)\}\}", "{{es icon}}");

// SWEDISH: Replace {{Sv}}, {{sv}}, {{Svenska}}, {{Svicon}}, {{Swe icon}} with {{sv icon}}. {{SV}} is a ship prefix template

ArticleText = Regex.Replace (ArticleText, @"\{\{(?:[Ss]v|[Ss]venska|[Ss]vicon|[Ss]we icon|Sv icon)\}\}", "{{sv icon}}");

// THAI: Replace {{Th icon}} with {{th icon}}.

ArticleText = Regex.Replace (ArticleText, @"\{\{Th icon\}\}", "{{th icon}}");

// TURKISH: Replace {{TR}}, {{Tr}}, {{Tr li}} with {{tr icon}}.

ArticleText = Regex.Replace (ArticleText, @"\{\{(?:TR|[Tt]r|Tr icon|Tr li)\}\}", "{{tr icon}}");

// UKRAINIAN: Replace {{Uk li}}, {{Ref-uk}}, {{Ua icon}} with {{uk icon}}.

ArticleText = Regex.Replace (ArticleText, @"\{\{(?:[Uu]k li|[Rr]ef-uk|[Uu]a icon|Uk icon)\}\}", "{{uk icon}}");

// YIDDISH: Replace {{Yi}} with {{yi icon}}.

ArticleText = Regex.Replace (ArticleText, @"\{\{(?:[Yy]i|Yi icon)\}\}", "{{yi icon}}");

// OTHERS: Replace these redirects for completeness: {{Ref-az}}, {{Ref-be}}, {{Ref-hy}}, {{Ref-uz}}

ArticleText = Regex.Replace (ArticleText, @"\{\{[Rr]ef-((?:az|be|hy|uz))\}\}", "{{$1 icon}}");

// OTHERS: Replace these redirects for completeness:

// {{Af li}}, {{Ba li}}, {{Be li}}, {{Bg li}}, {{Br li}}, {{Et li}}, {{Eu li}}, {{Ga li}}, {{Gd li}},

// {{Gn li}},{{Is li}}, {{Ka li}}, {{Ln li}}, {{Mg li}}, {{Ms li}}, {{Qu li}}, {{Tl li}}, {{Vi li}}

ArticleText = Regex.Replace (ArticleText, @"\{\{((?:[Aa]f|[Bb]a|[Bb]e|[Bb]g|[Bb]r|[Ee]t|[Ee]u|[Gg]a|[Gg]d|[Gg]n|[Ii]s|[Kk]a|[Ll]n|[Mm]g|[Mm]s|[Qq]u|[Tt]l|[Vv]i)) li\}\}",

delegate(Match match)

{

return @"{{" + match.Groups[1].Value.ToLower() + @" icon}}"; // set language code portion to lower case

});

// OTHERS: set mixed and upper case codes in {{xx icon}} templates to lower case for completeness also remove hyphens: {{Xx-icon}} and {{XX-icon}} to {{xx icon}}

ArticleText = Regex.Replace (ArticleText, @"\{\{([a-zA-Z]{2})[\s-]icon\}\}",

delegate(Match match)

{

return @"{{" + match.Groups[1].Value.ToLower() + @" icon}}"; // set language code portion to lower case

});

//-------------------------------------------------------------------------------

// these rules support ISO639-2, 3, etc three-character codes: {{xxx icon}}

// ICON GROUPS: Protect {{xx icons}} when there are two of them separated by ' and ': {{xx icon}} and {{xx icon}} is changed to:

// {{__PROTECTED__xx icon}} and {{xx icon__PROTECTED__}}

// This rule prevents later rules from moving the first or last of an icon group into |language=

ArticleText = Regex.Replace(ArticleText, @"(\{\{)([a-z]{2,3}\s*icon\}\}\s*and\s*\{\{[a-z]{2,3}[\s-]icon)(\}\})", "$1__PROTECTED__$2__PROTECTED__$3");

// ICON GROUPS: Protect {{xx icons}} when there are multiples of them: {{xx icon}} {{xx icon}} {{xx icon}} is changed to:

// {{__PROTECTED__xx icon}} {{xx icon}} {{xx icon__PROTECTED__}}

// This rule prevents later rules from moving the first or last of an icon group into |language=

ArticleText = Regex.Replace(ArticleText, @"(\{\{)([a-z]{2,3}\s*icon\}\}(?:\s*[,;/–-]?\s*&?\s*\{\{[a-z]{2,3}[\s-]icon\}\})*\s*[,;/–-]?\s*&?\s*\{\{[a-z]{2,3}[\s-]icon)(\}\})", "$1__PROTECTED__$2__PROTECTED__$3");

// ENGLISH ICON: Protect {{en icon}} when it is in a group of icons but is not one of the end icons

// This rule prevents the delete {{en icon}} rule from deleting {{en icon}} when it is a member of a group of icons. When in a group,

// if {{en icon}} is not one of the end icons, it always follows another so a rule for an {{en icon}} preceding {{xx icon}} is not necessary.

ArticleText = Regex.Replace(ArticleText, @"([a-z]{2,3}\s*icon\}\}\s*[,;/–-]?\s*\{\{)(en\s*icon\}\})", "$1__PROTECTED__$2");

//---------------------------< R E M O V A L S >--------------------------------------------------------------

// INSIDE ICONS: Find {{xx icon}} templates inside a CS1 citation template. Move {{xx icon}} ahead of the citation so it can be processed by later rules

// Doesn't find inside {{xx icon}} templates if the citation also has other templates ahead of {{xx icon}}

pattern = @"(\{\{\s*" + IS_CS1E + @"[^\{\}]+)(\{\{\w{2,2}\s*icon\s*\}\})";

if (Regex.Match (ArticleText, pattern).Success)

{

ArticleText = Regex.Replace(ArticleText, pattern, "$2$1");

Skip = false;

}

// LANGUAGE MAGIC WORDS: Find {{#language:xx|xx}} magic words inside a CS1 citation template. Remove all but language code. Assume associated with |language=

// Doesn't find inside {{#language:xx}} if the citation also has other templates ahead of {{#language:xx}}

pattern = @"(\{\{\s*" + IS_CS1E + @"[^\{\}]*\|\s*language\s*=\s*)\{\{#language:([a-zA-Z]{2})[^\}]*\}\}";

if (Regex.Match (ArticleText, pattern).Success)

{

ArticleText = Regex.Replace(ArticleText, pattern, "$1$2");

Skip = false;

}

// EMPTY PARAMETERS: Remove empty |language= parameters so we don't end up with two. This rule follows the INSIDE ICONS rule so that newly emptied |language= is removed.

ArticleText = Regex.Replace(ArticleText, @"(\{\{\s*" + IS_CS1E + @"[^\}]*)\|\s*language\s*=\s*([\|\}])", "$1$2");

// EMPTY PARAMETERS: Remove empty |script-title= parameters so we don't end up with two.

ArticleText = Regex.Replace(ArticleText, @"(\{\{\s*" + IS_CS1 + @"[^\}]*)\|\s*script-title\s*=\s*([\|\}])", "$1$2");

// WIKILINKS: Remove simple wikilinks from |language parameters because they prevent proper categorization

// Replace [[Text]] with Text

pattern = @"(\{\{\s*" +IS_CS1E + @"[^\}]*\|\s*language\s*=\s*)\[\[([A-Za-z\s]+)\]\]";

if (Regex.Match (ArticleText, pattern).Success)

{

ArticleText = Regex.Replace(ArticleText, pattern, "$1$2");

Skip = false;

}

// WIKILINKS: Remove complex wikilinks from |language parameters because they prevent proper categorization

// Replace [[Link|Text]] with Text

pattern = @"(\{\{\s*" +IS_CS1E + @"[^\}]*\|\s*language\s*=\s*)\[\[[A-Za-z\s]+\|([A-Za-z\s]+)\]\]";

if (Regex.Match (ArticleText, pattern).Success)

{

ArticleText = Regex.Replace(ArticleText, pattern, "$1$2");

Skip = false;

}

// this rule disabled and replaced by the next rules because the module simply hides english annotation

// pattern = @"({{\s*" + IS_CS1E + @"[^}]+)\|\s*language\s*=\s*(?:[Ee]nglish|[Bb]ritish [Ee]nglish|en\-[a-zA-Z]+|EN|[Ee]ng?)\s*([\|\}])";

// if (Regex.Match (ArticleText, pattern).Success)

// {

// ArticleText = Regex.Replace(ArticleText, pattern, "$1$2");

// Skip = false;

// }

// ENGLISH: Replace |language=en-XX with en

// disabled 2019-06-10 because cs1|2 ignores everything after the language code in IETF-like tags

// pattern = @"({{\s*" + IS_CS1E + @"[^}]+\|\s*language\s*=\s*en)\-[a-zA-Z]+\s*([\|\}])";

// if (Regex.Match (ArticleText, pattern).Success)

// {

// ArticleText = Regex.Replace(ArticleText, pattern, "$1$2");

// Skip = false;

// }

// ENGLISH: Replace |language=Eng with en

pattern = @"({{\s*" + IS_CS1E + @"[^}]+\|\s*language\s*=\s*)[Ee]ng\.?(\s*[\|\}])";

if (Regex.Match (ArticleText, pattern).Success)

{

ArticleText = Regex.Replace(ArticleText, pattern, "$1en$2");

// Skip = false; // not sufficient change to save an article

}

// ENGLISH: Replace |language=British English with English.

pattern = @"({{\s*" + IS_CS1E + @"[^}]+\|\s*language\s*=\s*)[Bb]ritish [Ee]nglish(\s*[\|\}])";

if (Regex.Match (ArticleText, pattern).Success)

{

ArticleText = Regex.Replace(ArticleText, pattern, "$1English$2");

Skip = false;

}

// this rule disabled and replaced with next rule because the module simply hides english

// ENGLISH: Remove |language=English language

// pattern = @"({{\s*" + IS_CS1E + @"[^}]+\|\s*language\s*=\s*)[Ee]nglish\s*[Ll]anguage(\s*[\|\}])";

// if (Regex.Match (ArticleText, pattern).Success)

// {

// ArticleText = Regex.Replace(ArticleText, pattern, "$1$2");

// Skip = false;

// }

// ENGLISH: Replace |language=English language with English.

pattern = @"({{\s*" + IS_CS1E + @"[^}]+\|\s*language\s*=\s*[Ee]nglish)\s*[Ll]anguage\s*([\|\}])";

if (Regex.Match (ArticleText, pattern).Success)

{

ArticleText = Regex.Replace(ArticleText, pattern, "$1$2"); // pattern has only two groups; a "$3" here would be emitted literally

Skip = false;

}

//---------------------------< M I S C M O D I F I C A T I O N S >------------------------------------------

// SUBCODES: Change |language=xx-XX (language code - subcode pairs) to |language=xx

// disabled 2019-06-10 because cs1|2 ignores everything after the language code in IETF-like tags

// pattern = @"({{\s*" + IS_CS1E + @"[^}]+\|\s*language\s*=\s*)([a-zA-Z]{2})\-[a-zA-Z]+(\s*[\|\}])";

// if (Regex.Match (ArticleText, pattern).Success)

// {

// ArticleText = Regex.Replace(ArticleText, pattern, "$1$2$3");

// Skip = false;

// }

// CHINESE: Change |language=simplified (or standard or traditional) Chinese to |language=Chinese

pattern = @"({{\s*" + IS_CS1E + @"[^}]+\|\s*language\s*=\s*)(?:[Ss]implified|[Ss]tandard|[Tt]raditional)\s*Chinese(\s*[\|\}])";

if (Regex.Match (ArticleText, pattern).Success)

{

ArticleText = Regex.Replace(ArticleText, pattern, "$1Chinese$2");

Skip = false;

}

// CHINESE: Change |language=traditional Chinese to |language=Chinese

// pattern = @"({{\s*" + IS_CS1E + @"[^}]+\|\s*language\s*=\s*)[Tt]raditional\s*Chinese(\s*[\|\}])";

// if (Regex.Match (ArticleText, pattern).Success)

// {

// ArticleText = Regex.Replace(ArticleText, pattern, "$1Chinese$2");

// Skip = false;

// }

// CHINESE: Change |language=Mandarin and |language=Cantonese (dialects) to |language=Chinese

pattern = @"({{\s*" + IS_CS1E + @"[^}]+\|\s*language\s*=\s*)(?:[Cc]antonese|[Mm]andarin)(\s*[\|\}])";

if (Regex.Match (ArticleText, pattern).Success)

{

ArticleText = Regex.Replace(ArticleText, pattern, "$1Chinese$2");

Skip = false;

}

// JAPANESE: Change |language=Japan to |language=Japanese

pattern = @"({{\s*" + IS_CS1E + @"[^}]+\|\s*language\s*=\s*)[Jj]apan(\s*[\|\}])";

if (Regex.Match (ArticleText, pattern).Success)

{

ArticleText = Regex.Replace(ArticleText, pattern, "$1Japanese$2");

Skip = false;

}

// JAPANESE: Change |language=Japanese – Shift-JIS (or other extraneous text) to |language=Japanese

// pattern = @"({{\s*" + IS_CS1 + @"[^}]+\|\s*language\s*=\s*)[Jj]apanese[^\|\}]*(\s*[\|\}])";

// if (Regex.Match (ArticleText, pattern).Success)

// {

// ArticleText = Regex.Replace(ArticleText, pattern, "$1Japanese$2");

// Skip = false;

// }

// NEDERLANDS: Change |language=Nederlands to |language=Dutch

pattern = @"({{\s*" + IS_CS1E + @"[^}]+\|\s*language\s*=\s*)(?:[Nn]ederlands|NL)(\s*[\|\}])";

if (Regex.Match (ArticleText, pattern).Success)

{

ArticleText = Regex.Replace(ArticleText, pattern, "$1Dutch$2");

Skip = false;

}

// NORWEGIAN BOKMÅL: Change |language=Bokmål to |language=Norwegian Bokmål

pattern = @"({{\s*" + IS_CS1E + @"[^}]+\|\s*language\s*=\s*)[Bb]okmål(\s*[\|\}])";

if (Regex.Match (ArticleText, pattern).Success)

{

ArticleText = Regex.Replace(ArticleText, pattern, "$1Norwegian Bokmål$2");

Skip = false;

}

// NORWEGIAN NYNORSK: Change |language=Nynorsk to |language=Norwegian Nynorsk

pattern = @"({{\s*" + IS_CS1E + @"[^}]+\|\s*language\s*=\s*)[Nn]ynorsk(\s*[\|\}])";

if (Regex.Match (ArticleText, pattern).Success)

{

ArticleText = Regex.Replace(ArticleText, pattern, "$1Norwegian Nynorsk$2");

Skip = false;

}

// PORTUGUÊS: Change |language=Português, |language=Portugeas to |language=Portuguese

pattern = @"({{\s*" + IS_CS1E + @"[^}]+\|\s*language\s*=\s*)(?:[Pp]ortuguês|[Pp]ortugeas)(\s*[\|\}])";

if (Regex.Match (ArticleText, pattern).Success)

{

ArticleText = Regex.Replace(ArticleText, pattern, "$1Portuguese$2");

Skip = false;

}

// SLOVENE: Change |language=Slovene to |language=Slovenian

pattern = @"({{\s*" + IS_CS1E + @"[^}]+\|\s*language\s*=\s*)[Ss]lovene(\s*[\|\}])";

if (Regex.Match (ArticleText, pattern).Success)

{

ArticleText = Regex.Replace(ArticleText, pattern, "$1Slovenian$2");

Skip = false;

}

// OTHERS: Change |language=<language> language to |language=<language>

pattern = @"({{\s*" + IS_CS1E + @"[^}]+\|\s*language\s*=\s*)([a-zA-Z]+)\s*[Ll]anguage(\s*[\|\}])";

if (Regex.Match (ArticleText, pattern).Success)

{

ArticleText = Regex.Replace(ArticleText, pattern, "$1$2$3");

Skip = false;

}

// MISSPELLINGS: Fix misspellings in |language=<language> where <language> is misspelled.

/* pattern = @"({{\s*" + IS_CS1E + @"[^}]+\|\s*language\s*=\s*)([^\|\}]*)";

if (Regex.Match (ArticleText, pattern).Success)

{

ArticleText = Regex.Replace(ArticleText, pattern,

delegate(Match match)

{

string new_spelling;

string return_string = @"RAW_MATCH " + match.Groups[0].Value; // no misspelling, return the raw string

try // get icon code's language name from dictionary

{

new_spelling = spelling_map[match.Groups[2].Value]; // will throw an exception if misspelled language (key) is not found in dictionary

}

catch (KeyNotFoundException) // trap the exception

{

return return_string; // return the raw string

}

return @"FIXED " + match.Groups[1].Value + new_spelling;

});

Skip = false;

}

/* this worked

if (Regex.Match (ArticleText, pattern).Success)

{

ArticleText = Regex.Replace(ArticleText, pattern,

delegate(Match match)

{

string new_spelling;

string return_string = @"RAW_MATCH " + match.Groups[0].Value; // no misspelling, return the raw string

try // get icon code's language name from dictionary

{

new_spelling = spelling_map[match.Groups[2].Value]; // will throw an exception if misspelled language (key) is not found in dictionary

}

catch (KeyNotFoundException) // trap the exception

{

return return_string; // return the raw string

}

return @"FIXED " + match.Groups[1].Value + new_spelling;

});

Skip = false;

}

*/

//-------------------------------------------------------------------------------------

// Here we protect {{xx icon}} when it is paired with a citation having |language=<language>. If the language code

// in {{xx icon}} matches <language> or if the code's assigned language name matches <language>, we delete the {{xx icon}}

// as superfluous. Otherwise, we can't know which, {{xx icon}} or |language=, is correct so we protect

// {{xx icon}}. The delegate functions compare icon language code to <language> and the code's assigned language name

// to <language> in an attempt to find a match.

// TODO: do special case for Maldivian, Dhivehi, Divehi? MediaWiki doesn't recognize Maldivian nor Dhivehi but does recognize Divehi

// LANGUAGE PARAMETER: Protect icons that follow citations having |language=<language> where <language> and {{xx icon}} don't match.

ArticleText = Regex.Replace(ArticleText, @"(\{\{\s*" + IS_CS1E + @"[^}]*\|\s*language\s*=\s*)([^\|\}\s]*)([^\}]*\}\}\s*)\{\{([a-zA-Z]{2})(\s+icon\}\})",

delegate(Match match)

{

string icon_lang;

string return_string = match.Groups[1].Value + match.Groups[2].Value + match.Groups[3].Value+ @"{{__PROTECTED2__" + match.Groups[4].Value + match.Groups[5].Value;

try // get icon code's language name from dictionary

{

icon_lang = language_map[match.Groups[4].Value]; // will throw an exception if icon code (key) is not found in dictionary

}

catch (KeyNotFoundException) // trap the exception

{

return return_string; // return the citation followed by a protected icon

}

// case insensitive string compare; compare code to code and name to name

if ((0 == String.Compare (match.Groups[2].Value, match.Groups[4].Value, true)) || (0 == String.Compare (icon_lang, match.Groups[2].Value, true)))

return match.Groups[1].Value + match.Groups[2].Value + match.Groups[3].Value; // matched so remove the icon

else

return return_string; // no match, protect the icon

});

// LANGUAGE PARAMETER: Protect icons that precede citations having |language=<language> where <language> and {{xx icon}} don't match.

ArticleText = Regex.Replace(ArticleText, @"\{\{([a-zA-Z]{2})\s+icon\}\}\s*(\{\{\s*" + IS_CS1E + @"[^}]*\|\s*language\s*=\s*)([^\|\}\s]*)",

delegate(Match match)

{

string icon_lang;

string return_string = @"{{__PROTECTED2__" + match.Groups[1].Value + @" icon}}" + match.Groups[2].Value + match.Groups[3].Value;

try // get icon code's language name from dictionary

{

icon_lang = language_map[match.Groups[1].Value]; // will throw an exception if icon code (key) is not found in dictionary

}

catch (KeyNotFoundException) // trap the exception

{

return return_string; // return a protected icon followed by the citation

}

// case insensitive string compare; compare code to code and name to name

if ((0 == String.Compare (match.Groups[1].Value, match.Groups[3].Value, true)) || (0 == String.Compare (icon_lang, match.Groups[3].Value, true)))

return match.Groups[2].Value + match.Groups[3].Value; // matched so remove the icon

else

return return_string; // no match, protect the icon

});

//-------------------------------------------------------------------------------------

// INCLUDED TEMPLATES: Protect any citations that contain other templates except {{xx icon}} templates. Matches any embedded template.

// By the time we get here, embedded {{xx icon}} templates that could be removed have been removed by the INSIDE ICONS rule.

ArticleText = Regex.Replace(ArticleText, @"(\{\{\s*)(" + IS_CS1E + @"[^\{\}]*\{\{[^\}]*\}\})", "$1__PROTECTED__$2");

//-------------------------------------------------------------------------------------

// This is a semi protection. There are later rules that edit citations with __PROTECTED1__

// This rule protects citations that contain Latin characters in |title=. Titles with Latin characters might be a mix of

// some script and English which might represent original writing system plus translation and/or transliteration. Such titles

// are too complicated for simple regex fixes so are protected. Some of these |title= parameters are wrapped in tags; the

// reason why isn't clear.

ArticleText = Regex.Replace(ArticleText, @"(\{\{\s*)(" + IS_CS1E + @"[^}]*\|\s*title\s*=(?:\s*)?[^\|\}]*(?=[a-zA-Z])[^\|\}]+)", "$1__PROTECTED1__$2");

//-------------------------------------------------------------------------------

// These rules delete unprotected English icons

// DELETE: ENGLISH: Remove {{en icon}} when not protected. This version applies when the icon is NOT at end of line; it also removes trailing space characters

ArticleText = Regex.Replace(ArticleText, @"\{\{(?:en icon|En li|En-icon|Ref-en)\}\} *([^\n])", "$1");

// DELETE: ENGLISH: Remove {{en icon}} when not protected. This version applies when the icon is at end of line; it also removes leading and trailing space characters

ArticleText = Regex.Replace(ArticleText, @" *\{\{(?:en icon|En li|En-icon|Ref-en)\}\} *(\n)", "$1");

//-----------------------------------------------------------------

// LANGUAGE: |language= may occur ahead of |title=; when it does, move it to the end of the citation before the closing }}

// This rule saves us the trouble of creating and maintaining duplicates of some of the following rules.

ArticleText = Regex.Replace(ArticleText, @"(\{\{\s*" + IS_CS1 + @"[^\}]*)(\|\s*language\s*=\s*[^\|\}]*)([^\}]*)(\|\s*title\s*=[^\}]*)(\}\})", "$1$3$4$2$5");

//---------------------------< S C R I P T - T I T L E S >----------------------------------------------------

// These rules replace |title= with an appropriate |script-title=, add the correct |language= parameter, and delete the adjacent {{xx icon}} template.

// All CS1 templates except {{cite encyclopedia}} which will require Module:Citation/CS1 support for |script-chapter=

// ARABIC and KURDISH, PASHTO, UYGHUR when written in Arabic. Find citations where |title= is in Arabic

// and the citation is followed by an {{ar icon}}, {{ku icon}}, {{ps icon}}, or {{ug icon}} template.

// Replace |title= with |script-title=xx:<title>; add |language=xx; delete {{xx icon}} where xx is ar, ku, ps, ug.
pattern = @"(\{\{" + IS_CS1 + @"[^\}]*\|\s*)title\s*=\s*(" + IS_ARABIC_SCRIPT + @")([^\}]*)(\}\})\s*\{\{((?:ar|ku|ps|ug)) icon\}\}";
if (Regex.Match (ArticleText, pattern).Success)
{
    ArticleText = Regex.Replace(ArticleText, pattern, "$1script-title=$5:$2$3|language=$5$4");
    Skip = false;
}

// ARABIC and KURDISH, PASHTO, UYGHUR when written in Arabic. Find citations where |title= is in Arabic
// and the citation is preceded by an {{ar icon}}, {{ku icon}}, {{ps icon}}, or {{ug icon}} template.
// Replace |title= with |script-title=xx:<title>; add |language=xx; delete {{xx icon}} where xx is ar, ku, ps, ug.
pattern = @"\{\{((?:ar|ku|ps|ug)) icon\}\}\s*(\{\{" + IS_CS1 + @"[^\}]*\|\s*)title\s*=\s*(" + IS_ARABIC_SCRIPT + @")([^\}]*)(\}\})";
if (Regex.Match (ArticleText, pattern).Success)
{
    ArticleText = Regex.Replace(ArticleText, pattern, "$2script-title=$1:$3$4|language=$1$5");
    Skip = false;
}

// ARABIC, KURDISH, PASHTO, UYGHUR: Find citations where |title= is in Arabic and the citation contains |language=ar or (ku, ps, ug).
// Replace |title= with |script-title=xx:<title>;
pattern = @"(\{\{" + IS_CS1 + @"[^\}]*\|\s*)title\s*=\s*(" + IS_ARABIC_SCRIPT + @")([^\}]*\|\s*language\s*=\s*((?:ar|ku|ps|ug)))([^\}]*\}\})";
if (Regex.Match (ArticleText, pattern).Success)
{
    ArticleText = Regex.Replace(ArticleText, pattern, "$1script-title=$4:$2$3$5");
    Skip = false;
}

// ARABIC: Find citations where |title= is in Arabic and the citation contains |language=Arabic.
// Replace |title= with |script-title=ar:<title>;
pattern = @"(\{\{" + IS_CS1 + @"[^\}]*\|\s*)title\s*=\s*(" + IS_ARABIC_SCRIPT + @")([^\}]*\|\s*language\s*=\s*[Aa]rabic)([^\}]*\}\})";
if (Regex.Match (ArticleText, pattern).Success)
{
    ArticleText = Regex.Replace(ArticleText, pattern, "$1script-title=ar:$2$3$4");
    Skip = false;
}

// KURDISH: Find citations where |title= is in Arabic and the citation contains |language=Kurdish. Replace |title= with |script-title=ku:<title>;
pattern = @"(\{\{" + IS_CS1 + @"[^\}]*\|\s*)title\s*=\s*(" + IS_ARABIC_SCRIPT + @")([^\}]*\|\s*language\s*=\s*[Kk]urdish)([^\}]*\}\})";
if (Regex.Match (ArticleText, pattern).Success)
{
    ArticleText = Regex.Replace(ArticleText, pattern, "$1script-title=ku:$2$3$4");
    Skip = false;
}

// PASHTO: Find citations where |title= is in Arabic and the citation contains |language=Pashto. Replace |title= with |script-title=ps:<title>;
pattern = @"(\{\{" + IS_CS1 + @"[^\}]*\|\s*)title\s*=\s*(" + IS_ARABIC_SCRIPT + @")([^\}]*\|\s*language\s*=\s*[Pp]ashto)([^\}]*\}\})";
if (Regex.Match (ArticleText, pattern).Success)
{
    ArticleText = Regex.Replace(ArticleText, pattern, "$1script-title=ps:$2$3$4");
    Skip = false;
}

// UYGHUR: Find citations where |title= is in Arabic and the citation contains |language=Uyghur. Replace |title= with |script-title=ug:<title>;
pattern = @"(\{\{" + IS_CS1 + @"[^\}]*\|\s*)title\s*=\s*(" + IS_ARABIC_SCRIPT + @")([^\}]*\|\s*language\s*=\s*[Uu]yghur)([^\}]*\}\})";
if (Regex.Match (ArticleText, pattern).Success)
{
    ArticleText = Regex.Replace(ArticleText, pattern, "$1script-title=ug:$2$3$4");
    Skip = false;
}

// ARMENIAN: Find citations where |title= is in Armenian and the citation is followed by {{hy icon}} template.
// Replace |title= with |script-title=hy:<title>; add |language=hy; delete {{hy icon}}
pattern = @"(\{\{" + IS_CS1 + @"[^\}]*\|\s*)title\s*=\s*(" + IS_ARMENIAN_SCRIPT + @")([^\}]*)(\}\})\s*\{\{hy icon\}\}";
if (Regex.Match (ArticleText, pattern).Success)
{
    ArticleText = Regex.Replace(ArticleText, pattern, "$1script-title=hy:$2$3|language=hy$4");
    Skip = false;
}

// ARMENIAN: Find citations where |title= is in Armenian and the citation is preceded by {{hy icon}} template.
// Replace |title= with |script-title=hy:<title>; add |language=hy; delete {{hy icon}}
pattern = @"\{\{hy icon\}\}\s*(\{\{" + IS_CS1 + @"[^\}]*\|\s*)title\s*=\s*(" + IS_ARMENIAN_SCRIPT + @")([^\}]*)(\}\})";
if (Regex.Match (ArticleText, pattern).Success)
{
    ArticleText = Regex.Replace(ArticleText, pattern, "$1script-title=hy:$2$3|language=hy$4");
    Skip = false;
}

// ARMENIAN: Find citations where |title= is in Armenian and the citation contains |language=hy or |language=Armenian
// Replace |title= with |script-title=hy:<title>;
pattern = @"(\{\{" + IS_CS1 + @"[^\}]*\|\s*)title\s*=\s*(" + IS_ARMENIAN_SCRIPT + @")([^\}]*\|\s*language\s*=\s*(?:[Aa]rmenian|[Hh]y))([^\}]*\}\})";
if (Regex.Match (ArticleText, pattern).Success)
{
    ArticleText = Regex.Replace(ArticleText, pattern, "$1script-title=hy:$2$3$4");
    Skip = false;
}

// CHINESE, JAPANESE, and KOREAN: Find citations where |title= is in CJK and the citation is followed by {{ja icon}}, {{ko icon}}, or {{zh icon}} template.
// Replace |title= with |script-title=xx:<title>; add |language=xx; delete {{xx icon}}
pattern = @"(\{\{" + IS_CS1 + @"[^\}]*\|\s*)title\s*=\s*(" + IS_CJK_SCRIPT + @")([^\}]*)(\}\})\s*\{\{((?:ja|ko|zh)) icon\}\}";
if (Regex.Match (ArticleText, pattern).Success)
{
    ArticleText = Regex.Replace(ArticleText, pattern, "$1script-title=$5:$2$3|language=$5$4");
    Skip = false;
}

// CHINESE, JAPANESE, and KOREAN: Find citations where |title= is in CJK and the citation is preceded by {{ja icon}}, {{ko icon}}, or {{zh icon}} template.
// Replace |title= with |script-title=xx:<title>; add |language=xx; delete {{xx icon}}
pattern = @"\{\{((?:ja|ko|zh)) icon\}\}\s*(\{\{" + IS_CS1 + @"[^\}]*\|\s*)title\s*=\s*(" + IS_CJK_SCRIPT + @")([^\}]*)(\}\})";
if (Regex.Match (ArticleText, pattern).Success)
{
    ArticleText = Regex.Replace(ArticleText, pattern, "$2script-title=$1:$3$4|language=$1$5");
    Skip = false;
}

// CHINESE: Find citations where |title= is in CJK and the citation contains |language=zh or |language=Chinese. Replace |title= with |script-title=zh:<title>;
pattern = @"(\{\{" + IS_CS1 + @"[^\}]*\|\s*)title\s*=\s*(" + IS_CJK_SCRIPT + @")([^\}]*\|\s*language\s*=\s*(?:[Zz]h|[Cc]hinese))([^\}]*\}\})";
if (Regex.Match (ArticleText, pattern).Success)
{
    ArticleText = Regex.Replace(ArticleText, pattern, "$1script-title=zh:$2$3$4");
    Skip = false;
}

// JAPANESE: Find citations where |title= is in CJK and the citation contains |language=ja or |language=Japanese. Replace |title= with |script-title=ja:<title>;
pattern = @"(\{\{" + IS_CS1 + @"[^\}]*\|\s*)title\s*=\s*(" + IS_CJK_SCRIPT + @")([^\}]*\|\s*language\s*=\s*(?:[Jj]a|[Jj]apanese))([^\}]*\}\})";
if (Regex.Match (ArticleText, pattern).Success)
{
    ArticleText = Regex.Replace(ArticleText, pattern, "$1script-title=ja:$2$3$4");
    Skip = false;
}

// KOREAN: Find citations where |title= is in CJK and the citation contains |language=ko or |language=Korean. Replace |title= with |script-title=ko:<title>;
pattern = @"(\{\{" + IS_CS1 + @"[^\}]*\|\s*)title\s*=\s*(" + IS_CJK_SCRIPT + @")([^\}]*\|\s*language\s*=\s*(?:[Kk]o|[Kk]orean))([^\}]*\}\})";
if (Regex.Match (ArticleText, pattern).Success)
{
    ArticleText = Regex.Replace(ArticleText, pattern, "$1script-title=ko:$2$3$4");
    Skip = false;
}

// GREEK: Find citations where |title= is in Greek and the citation is followed by {{el icon}} template.
// Replace |title= with |script-title=el:<title>; add |language=el; delete {{el icon}}
pattern = @"(\{\{" + IS_CS1 + @"[^\}]*\|\s*)title\s*=\s*(" + IS_GREEK_SCRIPT + @")([^\}]*)(\}\})\s*\{\{el icon\}\}";
if (Regex.Match (ArticleText, pattern).Success)
{
    ArticleText = Regex.Replace(ArticleText, pattern, "$1script-title=el:$2$3|language=el$4");
    Skip = false;
}

// GREEK: Find citations where |title= is in Greek and the citation is preceded by {{el icon}} template.
// Replace |title= with |script-title=el:<title>; add |language=el; delete {{el icon}}
pattern = @"\{\{el icon\}\}\s*(\{\{" + IS_CS1 + @"[^\}]*\|\s*)title\s*=\s*(" + IS_GREEK_SCRIPT + @")([^\}]*)(\}\})";
if (Regex.Match (ArticleText, pattern).Success)
{
    ArticleText = Regex.Replace(ArticleText, pattern, "$1script-title=el:$2$3|language=el$4");
    Skip = false;
}

// GREEK: Find citations where |title= is in Greek and the citation contains |language=el or |language=Greek or |language=<variant> Greek
// where <variant> might be Ancient, Byzantine, or Mycenaean.
// Replace |title= with |script-title=el:<title>;
pattern = @"(\{\{" + IS_CS1 + @"[^\}]*\|\s*)title\s*=\s*(" + IS_GREEK_SCRIPT + @")([^\}]*\|\s*language\s*=\s*(?:(?:[Aa]ncient |[Bb]yzantine |[Mm]ycenaean )?[Gg]reek|[Ee]l))([^\}]*\}\})";
if (Regex.Match (ArticleText, pattern).Success)
{
    ArticleText = Regex.Replace(ArticleText, pattern, "$1script-title=el:$2$3$4");
    Skip = false;
}

// HEBREW and YIDDISH: Find citations where |title= is in Hebrew or Yiddish and the citation is followed by an {{he icon}} or {{yi icon}} template.
// Replace |title= with |script-title=xx:<title>; add |language=xx; delete {{xx icon}} where xx is he or yi.
pattern = @"(\{\{" + IS_CS1 + @"[^\}]*\|\s*)title\s*=\s*(" + IS_HEBREW_SCRIPT + @")([^\}]*)(\}\})\s*\{\{((?:he|yi)) icon\}\}";
if (Regex.Match (ArticleText, pattern).Success)
{
    ArticleText = Regex.Replace(ArticleText, pattern, "$1script-title=$5:$2$3|language=$5$4");
    Skip = false;
}

// HEBREW and YIDDISH: Find citations where |title= is in Hebrew or Yiddish and the citation is preceded by an {{he icon}} or {{yi icon}} template.
// Replace |title= with |script-title=xx:<title>; add |language=xx; delete {{xx icon}} where xx is he or yi.
pattern = @"\{\{((?:he|yi)) icon\}\}\s*(\{\{" + IS_CS1 + @"[^\}]*\|\s*)title\s*=\s*(" + IS_HEBREW_SCRIPT + @")([^\}]*)(\}\})";
if (Regex.Match (ArticleText, pattern).Success)
{
    ArticleText = Regex.Replace(ArticleText, pattern, "$2script-title=$1:$3$4|language=$1$5");
    Skip = false;
}

// HEBREW: Find citations where |title= is in Hebrew and the citation contains |language=he or |language=Hebrew.
// Replace |title= with |script-title=he:<title>;
pattern = @"(\{\{" + IS_CS1 + @"[^\}]*\|\s*)title\s*=\s*(" + IS_HEBREW_SCRIPT + @")([^\}]*\|\s*language\s*=\s*(?:[Hh]ebrew|[Hh]e))([^\}]*\}\})";
if (Regex.Match (ArticleText, pattern).Success)
{
    ArticleText = Regex.Replace(ArticleText, pattern, "$1script-title=he:$2$3$4");
    Skip = false;
}

// YIDDISH: Find citations where |title= is in Hebrew script and the citation contains |language=yi or |language=Yiddish.
// Replace |title= with |script-title=yi:<title>;
pattern = @"(\{\{" + IS_CS1 + @"[^\}]*\|\s*)title\s*=\s*(" + IS_HEBREW_SCRIPT + @")([^\}]*\|\s*language\s*=\s*(?:[Yy]iddish|[Yy]i))([^\}]*\}\})";
if (Regex.Match (ArticleText, pattern).Success)
{
    ArticleText = Regex.Replace(ArticleText, pattern, "$1script-title=yi:$2$3$4");
    Skip = false;
}

// MALDIVIAN: Find citations where |title= is in Maldivian (Thaana) and the citation is followed by {{dv icon}} template.
// Replace |title= with |script-title=dv:<title>; add |language=dv; delete {{dv icon}}
pattern = @"(\{\{" + IS_CS1 + @"[^\}]*\|\s*)title\s*=\s*(" + IS_THAANA_SCRIPT + @")([^\}]*)(\}\})\s*\{\{dv icon\}\}";
if (Regex.Match (ArticleText, pattern).Success)
{
    ArticleText = Regex.Replace(ArticleText, pattern, "$1script-title=dv:$2$3|language=dv$4");
    Skip = false;
}

// MALDIVIAN: Find citations where |title= is in Maldivian (Thaana) and the citation is preceded by {{dv icon}} template.
// Replace |title= with |script-title=dv:<title>; add |language=dv; delete {{dv icon}}
pattern = @"\{\{dv icon\}\}\s*(\{\{" + IS_CS1 + @"[^\}]*\|\s*)title\s*=\s*(" + IS_THAANA_SCRIPT + @")([^\}]*)(\}\})";
if (Regex.Match (ArticleText, pattern).Success)
{
    ArticleText = Regex.Replace(ArticleText, pattern, "$1script-title=dv:$2$3|language=dv$4");
    Skip = false;
}

// MALDIVIAN: Find citations where |title= is in Maldivian (Thaana) and the citation contains |language=dv, |language=Maldivian,
// |language=Divehi, or |language=Dhivehi.
// Replace |title= with |script-title=dv:<title>;
// (The alternation previously contained an empty branch, "||", which let any |language= value match; it is removed here.)
pattern = @"(\{\{" + IS_CS1 + @"[^\}]*\|\s*)title\s*=\s*(" + IS_THAANA_SCRIPT + @")([^\}]*\|\s*language\s*=\s*(?:[Mm]aldivian|[Dd]v|[Dd]h?ivehi))([^\}]*\}\})";
if (Regex.Match (ArticleText, pattern).Success)
{
    ArticleText = Regex.Replace(ArticleText, pattern, "$1script-title=dv:$2$3$4");
    Skip = false;
}

// PERSIAN: Find citations where |title= is in Arabic, Cyrillic, and/or Hebrew and the citation is followed by an {{fa icon}} template.
// Replace |title= with |script-title=fa:<title>; add |language=fa; delete {{fa icon}}
pattern = @"(\{\{" + IS_CS1 + @"[^\}]*\|\s*)title\s*=\s*(" + IS_PERSIAN_SCRIPT + @")([^\}]*)(\}\})\s*\{\{fa icon\}\}";
if (Regex.Match (ArticleText, pattern).Success)
{
    ArticleText = Regex.Replace(ArticleText, pattern, "$1script-title=fa:$2$3|language=fa$4");
    Skip = false;
}

// PERSIAN: Find citations where |title= is in Arabic, Cyrillic, and/or Hebrew and the citation is preceded by an {{fa icon}} template.
// Replace |title= with |script-title=fa:<title>; add |language=fa; delete {{fa icon}}
pattern = @"\{\{fa icon\}\}\s*(\{\{" + IS_CS1 + @"[^\}]*\|\s*)title\s*=\s*(" + IS_PERSIAN_SCRIPT + @")([^\}]*)(\}\})";
if (Regex.Match (ArticleText, pattern).Success)
{
    ArticleText = Regex.Replace(ArticleText, pattern, "$1script-title=fa:$2$3|language=fa$4");
    Skip = false;
}

// PERSIAN: Find citations where |title= is in Arabic, Cyrillic, and/or Hebrew and the citation contains |language=fa or |language=Persian.
// Replace |title= with |script-title=fa:<title>;
pattern = @"(\{\{" + IS_CS1 + @"[^\}]*\|\s*)title\s*=\s*(" + IS_PERSIAN_SCRIPT + @")([^\}]*\|\s*language\s*=\s*(?:[Pp]ersian|[Ff]a))([^\}]*\}\})";
if (Regex.Match (ArticleText, pattern).Success)
{
    ArticleText = Regex.Replace(ArticleText, pattern, "$1script-title=fa:$2$3$4");
    Skip = false;
}

// RUSSIAN, BOSNIAN, SERBIAN, UKRAINIAN: Find citations where |title= is in Cyrillic and the citation is followed by an {{ru icon}}, {{bs icon}}, {{sr icon}}, or {{uk icon}} template.
// Replace |title= with |script-title=xx:<title>; add |language=xx; delete {{xx icon}}
pattern = @"(\{\{" + IS_CS1 + @"[^\}]*\|\s*)title\s*=\s*(" + IS_CYRILLIC_SCRIPT + @")([^\}]*)(\}\})\s*\{\{((?:ru|bs|sr|uk)) icon\}\}";
if (Regex.Match (ArticleText, pattern).Success)
{
    ArticleText = Regex.Replace(ArticleText, pattern, "$1script-title=$5:$2$3|language=$5$4");
    Skip = false;
}

// RUSSIAN, BOSNIAN, SERBIAN, UKRAINIAN: Find citations where |title= is in Cyrillic and the citation is preceded by an {{ru icon}}, {{bs icon}}, {{sr icon}}, or {{uk icon}} template.
// Replace |title= with |script-title=xx:<title>; add |language=xx; delete {{xx icon}}
pattern = @"\{\{((?:ru|bs|sr|uk)) icon\}\}\s*(\{\{" + IS_CS1 + @"[^\}]*\|\s*)title\s*=\s*(" + IS_CYRILLIC_SCRIPT + @")([^\}]*)(\}\})";
if (Regex.Match (ArticleText, pattern).Success)
{
    ArticleText = Regex.Replace(ArticleText, pattern, "$2script-title=$1:$3$4|language=$1$5");
    Skip = false;
}

// RUSSIAN: Find citations where |title= is in Cyrillic and the citation contains |language=ru or |language=Russian.
// Replace |title= with |script-title=ru:<title>;
pattern = @"(\{\{" + IS_CS1 + @"[^\}]*\|\s*)title\s*=\s*(" + IS_CYRILLIC_SCRIPT + @")([^\}]*\|\s*language\s*=\s*(?:[Rr]ussian|[Rr]u))([^\}]*\}\})";
if (Regex.Match (ArticleText, pattern).Success)
{
    ArticleText = Regex.Replace(ArticleText, pattern, "$1script-title=ru:$2$3$4");
    Skip = false;
}

// BOSNIAN: Find citations where |title= is in Cyrillic and the citation contains |language=bs or |language=Bosnian.
// Replace |title= with |script-title=bs:<title>;
pattern = @"(\{\{" + IS_CS1 + @"[^\}]*\|\s*)title\s*=\s*(" + IS_CYRILLIC_SCRIPT + @")([^\}]*\|\s*language\s*=\s*(?:[Bb]osnian|[Bb]s))([^\}]*\}\})";
if (Regex.Match (ArticleText, pattern).Success)
{
    ArticleText = Regex.Replace(ArticleText, pattern, "$1script-title=bs:$2$3$4");
    Skip = false;
}

// SERBIAN: Find citations where |title= is in Cyrillic and the citation contains |language=sr or |language=Serbian.
// Replace |title= with |script-title=sr:<title>;
pattern = @"(\{\{" + IS_CS1 + @"[^\}]*\|\s*)title\s*=\s*(" + IS_CYRILLIC_SCRIPT + @")([^\}]*\|\s*language\s*=\s*(?:[Ss]erbian|[Ss]r))([^\}]*\}\})";
if (Regex.Match (ArticleText, pattern).Success)
{
    ArticleText = Regex.Replace(ArticleText, pattern, "$1script-title=sr:$2$3$4");
    Skip = false;
}

// UKRAINIAN: Find citations where |title= is in Cyrillic and the citation contains |language=uk or |language=Ukrainian.
// Replace |title= with |script-title=uk:<title>;
pattern = @"(\{\{" + IS_CS1 + @"[^\}]*\|\s*)title\s*=\s*(" + IS_CYRILLIC_SCRIPT + @")([^\}]*\|\s*language\s*=\s*(?:[Uu]krainian|[Uu]k))([^\}]*\}\})";
if (Regex.Match (ArticleText, pattern).Success)
{
    ArticleText = Regex.Replace(ArticleText, pattern, "$1script-title=uk:$2$3$4");
    Skip = false;
}

// SINDHI: Find citations where |title= is in Arabic or Devanagari and the citation is followed by an {{sd icon}} template.
// Replace |title= with |script-title=sd:<title>; add |language=sd; delete {{sd icon}}
pattern = @"(\{\{" + IS_CS1 + @"[^\}]*\|\s*)title\s*=\s*(" + IS_SINDHI_SCRIPT + @")([^\}]*)(\}\})\s*\{\{sd icon\}\}";
if (Regex.Match (ArticleText, pattern).Success)
{
    ArticleText = Regex.Replace(ArticleText, pattern, "$1script-title=sd:$2$3|language=sd$4");
    Skip = false;
}

// SINDHI: Find citations where |title= is in Arabic or Devanagari and the citation is preceded by an {{sd icon}} template.
// Replace |title= with |script-title=sd:<title>; add |language=sd; delete {{sd icon}}
pattern = @"\{\{sd icon\}\}\s*(\{\{" + IS_CS1 + @"[^\}]*\|\s*)title\s*=\s*(" + IS_SINDHI_SCRIPT + @")([^\}]*)(\}\})";
if (Regex.Match (ArticleText, pattern).Success)
{
    ArticleText = Regex.Replace(ArticleText, pattern, "$1script-title=sd:$2$3|language=sd$4");
    Skip = false;
}

// SINDHI: Find citations where |title= is in Arabic or Devanagari and the citation contains |language=sd or |language=Sindhi. Replace |title= with |script-title=sd:<title>;
pattern = @"(\{\{" + IS_CS1 + @"[^\}]*\|\s*)title\s*=\s*(" + IS_SINDHI_SCRIPT + @")([^\}]*\|\s*language\s*=\s*(?:[Ss]indhi|[Ss]d))([^\}]*\}\})";
if (Regex.Match (ArticleText, pattern).Success)
{
    ArticleText = Regex.Replace(ArticleText, pattern, "$1script-title=sd:$2$3$4");
    Skip = false;
}

// THAI: Find citations where |title= is in Thai and the citation is followed by {{th icon}} template.
// Replace |title= with |script-title=th:<title>; add |language=th; delete {{th icon}}
pattern = @"(\{\{" + IS_CS1 + @"[^\}]*\|\s*)title\s*=\s*(" + IS_THAI_SCRIPT + @")([^\}]*)(\}\})\s*\{\{th icon\}\}";
if (Regex.Match (ArticleText, pattern).Success)
{
    ArticleText = Regex.Replace(ArticleText, pattern, "$1script-title=th:$2$3|language=th$4");
    Skip = false;
}

// THAI: Find citations where |title= is in Thai and the citation is preceded by {{th icon}} template.
// Replace |title= with |script-title=th:<title>; add |language=th; delete {{th icon}}
pattern = @"\{\{th icon\}\}\s*(\{\{" + IS_CS1 + @"[^\}]*\|\s*)title\s*=\s*(" + IS_THAI_SCRIPT + @")([^\}]*)(\}\})";
if (Regex.Match (ArticleText, pattern).Success)
{
    ArticleText = Regex.Replace(ArticleText, pattern, "$1script-title=th:$2$3|language=th$4");
    Skip = false;
}

// THAI: Find citations where |title= is in Thai and the citation contains |language=th or |language=Thai.
// Replace |title= with |script-title=th:<title>;
pattern = @"(\{\{" + IS_CS1 + @"[^\}]*\|\s*)title\s*=\s*(" + IS_THAI_SCRIPT + @")([^\}]*\|\s*language\s*=\s*(?:[Tt]hai|[Tt]h))([^\}]*\}\})";
if (Regex.Match (ArticleText, pattern).Success)
{
    ArticleText = Regex.Replace(ArticleText, pattern, "$1script-title=th:$2$3$4");
    Skip = false;
}

// OTHERS: Find {{xx icon}} templates that follow a CS1 citation template. Remove {{xx icon}} and add |language=xx
// __PROTECTED1__ citations were protected because of a mix of script and Latin so it is OK to move {{xx icon}} to |language=xx
pattern = @"(\{\{(?:__PROTECTED1__)?" + IS_CS1E + @"[^\}]*)(\}\})\s*\{\{([A-Za-z][A-Za-z]) icon\}\}";
if (Regex.Match (ArticleText, pattern).Success)
{
    ArticleText = Regex.Replace(ArticleText, pattern, "$1|language=$3$2");
    Skip = false;
}

// OTHERS: Find {{xx icon}} templates that precede a CS1 citation template.
// Remove {{xx icon}} and add |language=xx
// __PROTECTED1__ citations were protected because of a mix of script and Latin so it is OK to move {{xx icon}} to |language=xx
pattern = @"\{\{([a-z]{2,2}) icon\}\}\s*(\{\{(?:__PROTECTED1__)?" + IS_CS1E + @"[^\}]*)(\}\})";
if (Regex.Match (ArticleText, pattern).Success)
{
    ArticleText = Regex.Replace(ArticleText, pattern, "$2|language=$1$3");
    Skip = false;
}

//---------------------------------------------------------------------------------------
// UNPROTECT: This is the last step of the conversion process. Once all of the other rules have run, if we protected any citations
// or icons by adding __PROTECTED__, __PROTECTED1__, or __PROTECTED2__ to them, search for those strings and replace them with nothing.
ArticleText = Regex.Replace(ArticleText, @"__PROTECTED\d?__", "");
// ArticleText = Regex.Replace(ArticleText, @"__PROTECTED1?__", "");

//---------------------------< M U L T I - I C O N   T O   L A N G U A G E >----------------------------------
// In this section we attempt to place the language codes from multiple (2–5) {{xx icon}} templates into a comma-separated value for |language=

// LANGUAGE PARAMETER: Protect cs1|2 templates that have a value assigned to |language=.
ArticleText = Regex.Replace(ArticleText, @"(\{\{\s*)(" + IS_CS1E + @"[^}]*\|\s*language\s*=\s*[^\|\}]+)", "$1__PROTECTED__$2");

// INCLUDED TEMPLATES: Protect any citations that contain other templates. Matches any embedded template.
// By the time we get here, embedded {{xx icon}} templates that could be removed have been removed by the INSIDE ICONS rule.
ArticleText = Regex.Replace(ArticleText, @"(\{\{\s*)(" + IS_CS1E + @"[^\{\}]*\{\{[^\}]*\}\})", "$1__PROTECTED__$2");

// five {{xx icon}} templates separated by ' and ', '&', '/' or space or nothing.
ArticleText = Regex.Replace(ArticleText, @"(\{\{\s*" + IS_CS1E + @"[^\}]*)(\}\})\s*\{\{([a-z]{2})\s*icon\}\}\s*(?:and|&|/)?\s*\{\{([a-z]{2})\s*icon\}\}\s*(?:and|&|/)?\s*\{\{([a-z]{2})\s*icon\}\}\s*(?:and|&|/)?\s*\{\{([a-z]{2})\s*icon\}\}\s*(?:and|&|/)?\s*\{\{([a-z]{2})\s*icon\}\}", "$1 |language=$3, $4, $5, $6, $7$2");

// four {{xx icon}} templates separated by ' and ', '&', '/' or space or nothing.
ArticleText = Regex.Replace(ArticleText, @"(\{\{\s*" + IS_CS1E + @"[^\}]*)(\}\})\s*\{\{([a-z]{2})\s*icon\}\}\s*(?:and|&|/)?\s*\{\{([a-z]{2})\s*icon\}\}\s*(?:and|&|/)?\s*\{\{([a-z]{2})\s*icon\}\}\s*(?:and|&|/)?\s*\{\{([a-z]{2})\s*icon\}\}", "$1 |language=$3, $4, $5, $6$2");

// three {{xx icon}} templates separated by ' and ', '&', '/' or space or nothing.
ArticleText = Regex.Replace(ArticleText, @"(\{\{\s*" + IS_CS1E + @"[^\}]*)(\}\})\s*\{\{([a-z]{2})\s*icon\}\}\s*(?:and|&|/)?\s*\{\{([a-z]{2})\s*icon\}\}\s*(?:and|&|/)?\s*\{\{([a-z]{2})\s*icon\}\}", "$1 |language=$3, $4, $5$2");

// two {{xx icon}} templates separated by ' and ', '&', '/' or space or nothing.
ArticleText = Regex.Replace(ArticleText, @"(\{\{\s*" + IS_CS1E + @"[^\}]*)(\}\})\s*\{\{([a-z]{2})\s*icon\}\}\s*(?:and|&|/)?\s*\{\{([a-z]{2})\s*icon\}\}", "$1 |language=$3, $4$2");

// UNPROTECT: This is the last step of the multi-icon process
ArticleText = Regex.Replace(ArticleText, @"__PROTECTED__", "");

// CLEANUP: Find citations where Monkbot task 6 didn't properly ignore citations with embedded templates (pre-rev e)
// Move the |language= parameter from the end of the embedded template to the end of the enclosing citation.
pattern = @"(\{\{" + IS_CS1 + @"[^\}]*\{\{[^\}]*)(\|\s*language\s*=[^\|\}]*)(\}\}[^\}]*)(\}\})";
if (Regex.Match (ArticleText, pattern).Success)
{
    ArticleText = Regex.Replace(ArticleText, pattern, "$1$3$2$4");
    Skip = false;
}

return ArticleText;
}
</syntaxhighlight></div></section><section class='wiki-section collapsible' id='section--settings-file'><h2 class='section-toggle'>AWB settings file</h2><div class='wiki-body'><syntaxhighlight lang="xml">
<?xml version="1.0" encoding="utf-8"?>
<AutoWikiBrowserPreferences xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xml:space="preserve" Version="5.5.5.0">
  <Project>wikipedia</Project>
  <LanguageCode>en</LanguageCode>
  <CustomProject />
  <Protocol>http://</Protocol>
  <LoginDomain />
  <List>
    <ListSource>Template:ar icon</ListSource>
    <SelectedProvider>WhatTranscludesPageListProvider</SelectedProvider>
    <ArticleList />
  </List>
  <FindAndReplace>
    <Enabled>false</Enabled>
    <IgnoreSomeText>false</IgnoreSomeText>
    <IgnoreMoreText>false</IgnoreMoreText>
    <AppendSummary>false</AppendSummary>
    <Replacements />
    <AdvancedReps />
    <SubstTemplates />
    <IncludeComments>false</IncludeComments>
    <ExpandRecursively>true</ExpandRecursively>
    <IgnoreUnformatted>false</IgnoreUnformatted>
  </FindAndReplace>
  <Editprefs>
    <GeneralFixes>false</GeneralFixes>
    <Tagger>false</Tagger>
    <Unicodify>false</Unicodify>
    <Recategorisation>0</Recategorisation>
    <NewCategory />
    <NewCategory2 />
    <ReImage>0</ReImage>
    <ImageFind />
    <Replace />
    <SkipIfNoCatChange>false</SkipIfNoCatChange>
    <RemoveSortKey>false</RemoveSortKey>
    <SkipIfNoImgChange>false</SkipIfNoImgChange>
    <AppendText>false</AppendText>
    <AppendTextMetaDataSort>false</AppendTextMetaDataSort>
    <Append>true</Append>
    <Text />
    <Newlines>2</Newlines>
    <AutoDelay>10</AutoDelay>
    <BotMaxEdits>0</BotMaxEdits>
    <SupressTag>false</SupressTag>
    <RegexTypoFix>false</RegexTypoFix>
  </Editprefs>
  <General>
    <AutoSaveEdit>
      <Enabled>false</Enabled>
      <SavePeriod>30</SavePeriod>
      <SaveFile />
    </AutoSaveEdit>
    <SelectedSummary>[[User:Monkbot/Task 6: CS1 language support|Task 6]]: ([[Wikipedia:Bots/Requests for approval/Monkbot 6|Bot trial]]) replace language icon template with language parameter in CS1 citations; cleanup language icons;</SelectedSummary>
    <Summaries>
      <string>clean up</string>
      <string>re-categorisation per [[WP:CFD|CFD]]</string>
      <string>clean up and re-categorisation per [[WP:CFD|CFD]]</string>
      <string>removing category per [[WP:CFD|CFD]]</string>
      <string>[[Wikipedia:Template substitution|subst:'ing]]</string>
      <string>[[Wikipedia:WikiProject Stub sorting|stub sorting]]</string>
      <string>[[WP:AWB/T|Typo fixing]]</string>
      <string>bad link repair</string>
      <string>Fixing [[Wikipedia:Disambiguation pages with links|links to disambiguation pages]]</string>
      <string>Unicodifying</string>
      <string>replace language icon template with language parameter in CS1 citations; cleanup language icons;</string>
    </Summaries>
    <PasteMore>
      <string />
      <string />
      <string />
      <string />
      <string />
      <string />
      <string />
      <string />
      <string />
      <string />
    </PasteMore>
    <FindText>\|\s*script-title=</FindText>
    <FindRegex>true</FindRegex>
    <FindCaseSensitive>false</FindCaseSensitive>
    <WordWrap>true</WordWrap>
    <ToolBarEnabled>false</ToolBarEnabled>
    <BypassRedirect>true</BypassRedirect>
    <AutoSaveSettings>false</AutoSaveSettings>
    <noSectionEditSummary>false</noSectionEditSummary>
    <restrictDefaultsortAddition>true</restrictDefaultsortAddition>
    <restrictOrphanTagging>true</restrictOrphanTagging>
    <noMOSComplianceFixes>false</noMOSComplianceFixes>
    <syntaxHighlightEditBox>false</syntaxHighlightEditBox>
    <highlightAllFind>false</highlightAllFind>
    <PreParseMode>false</PreParseMode>
    <NoAutoChanges>false</NoAutoChanges>
    <OnLoadAction>0</OnLoadAction>
    <DiffInBotMode>false</DiffInBotMode>
    <Minor>true</Minor>
    <AddToWatchlist>2</AddToWatchlist>
    <TimerEnabled>false</TimerEnabled>
    <SortListAlphabetically>false</SortListAlphabetically>
    <AddIgnoredToLog>false</AddIgnoredToLog>
    <EditToolbarEnabled>true</EditToolbarEnabled>
    <filterNonMainSpace>false</filterNonMainSpace>
    <AutoFilterDuplicates>false</AutoFilterDuplicates>
    <FocusAtEndOfEditBox>false</FocusAtEndOfEditBox>
    <scrollToUnbalancedBrackets>false</scrollToUnbalancedBrackets>
    <TextBoxSize>10</TextBoxSize>
    <TextBoxFont>Courier New</TextBoxFont>
    <LowThreadPriority>false</LowThreadPriority>
    <Beep>false</Beep>
    <Flash>false</Flash>
    <Minimize>false</Minimize>
    <LockSummary>false</LockSummary>
    <SaveArticleList>true</SaveArticleList>
    <SuppressUsingAWB>true</SuppressUsingAWB>
    <AddUsingAWBToActionSummaries>false</AddUsingAWBToActionSummaries>
    <IgnoreNoBots>false</IgnoreNoBots>
    <ClearPageListOnProjectChange>false</ClearPageListOnProjectChange>
    <SortInterWikiOrder>true</SortInterWikiOrder>
    <ReplaceReferenceTags>true</ReplaceReferenceTags>
    <LoggingEnabled>true</LoggingEnabled>
    <AlertPreferences />
  </General>
  <SkipOptions>
    <SkipNonexistent>true</SkipNonexistent>
    <Skipexistent>false</Skipexistent>
    <SkipWhenNoChanges>true</SkipWhenNoChanges>
    <SkipSpamFilterBlocked>false</SkipSpamFilterBlocked>
    <SkipInuse>false</SkipInuse>
    <SkipWhenOnlyWhitespaceChanged>false</SkipWhenOnlyWhitespaceChanged>
    <SkipOnlyGeneralFixChanges>true</SkipOnlyGeneralFixChanges>
    <SkipOnlyMinorGeneralFixChanges>false</SkipOnlyMinorGeneralFixChanges>
    <SkipOnlyCosmetic>false</SkipOnlyCosmetic>
    <SkipOnlyCasingChanged>true</SkipOnlyCasingChanged>
    <SkipIfRedirect>false</SkipIfRedirect>
    <SkipIfNoAlerts>false</SkipIfNoAlerts>
    <SkipDoes>true</SkipDoes>
    <SkipDoesNot>false</SkipDoesNot>
    <SkipDoesText>{{bots|Monkbot 6}}</SkipDoesText>
    <SkipDoesNotText></SkipDoesNotText>
    <Regex>true</Regex>
    <CaseSensitive>false</CaseSensitive>
    <AfterProcessing>false</AfterProcessing>
    <SkipNoFindAndReplace>false</SkipNoFindAndReplace>
    <SkipMinorFindAndReplace>false</SkipMinorFindAndReplace>
    <SkipNoRegexTypoFix>false</SkipNoRegexTypoFix>
    <SkipNoDisambiguation>false</SkipNoDisambiguation>
    <SkipNoLinksOnPage>false</SkipNoLinksOnPage>
    <GeneralSkipList />
  </SkipOptions>
  <Module>
    <Enabled>true</Enabled>
    <Language>C# 2.0</Language>
    <Code></Code>
  </Module>
  <ExternalProgram>
    <Enabled>false</Enabled>
    <Skip>false</Skip>
    <Program />
    <Parameters />
    <PassAsFile>true</PassAsFile>
    <OutputFile />
  </ExternalProgram>
  <Disambiguation>
    <Enabled>false</Enabled>
    <Link />
    <Variants />
    <ContextChars>20</ContextChars>
  </Disambiguation>
  <Special>
    <namespaceValues>
      <int>0</int>
    </namespaceValues>
    <remDupes>true</remDupes>
    <sortAZ>true</sortAZ>
    <filterTitlesThatContain>false</filterTitlesThatContain>
    <filterTitlesThatContainText />
    <filterTitlesThatDontContain>false</filterTitlesThatDontContain>
    <filterTitlesThatDontContainText />
    <areRegex>false</areRegex>
    <opType>0</opType>
    <remove />
  </Special>
  <Tool>
    <ListComparerUseCurrentArticleList>0</ListComparerUseCurrentArticleList>
    <ListSplitterUseCurrentArticleList>0</ListSplitterUseCurrentArticleList>
    <DatabaseScannerUseCurrentArticleList>0</DatabaseScannerUseCurrentArticleList>
  </Tool>
  <Plugin>
    <PluginPrefs>
      <Name>CSV Loader</Name>
      <PluginSettings>
        <anyType xsi:type="PrefsKeyPair">
          <Name>TextMode</Name>
          <Setting xsi:type="xsd:string">Append</Setting>
        </anyType>
        <anyType xsi:type="PrefsKeyPair">
          <Name>InputText</Name>
          <Setting xsi:type="xsd:string" />
        </anyType>
        <anyType xsi:type="PrefsKeyPair">
          <Name>ColumnHeaders</Name>
          <Setting xsi:type="xsd:string" />
        </anyType>
        <anyType xsi:type="PrefsKeyPair">
          <Name>Skip</Name>
          <Setting xsi:type="xsd:boolean">true</Setting>
        </anyType>
        <anyType xsi:type="PrefsKeyPair">
          <Name>Separator</Name>
          <Setting xsi:type="xsd:string">,</Setting>
        </anyType>
        <anyType xsi:type="PrefsKeyPair">
          <Name>CreateLists</Name>
          <Setting xsi:type="xsd:boolean">false</Setting>
        </anyType>
        <anyType xsi:type="PrefsKeyPair">
          <Name>ListSeparator</Name>
          <Setting xsi:type="xsd:string">^</Setting>
        </anyType>
        <anyType xsi:type="PrefsKeyPair">
          <Name>FindReplace</Name>
          <Setting xsi:type="xsd:boolean">false</Setting>
        </anyType>
        <anyType xsi:type="PrefsKeyPair">
          <Name>EditSummary</Name>
          <Setting xsi:type="xsd:string" />
        </anyType>
      </PluginSettings>
    </PluginPrefs>
  </Plugin>
</AutoWikiBrowserPreferences>
</syntaxhighlight></div></section></div></main>