User:Moneytrees/CCI guide

{{Wikipedia how-to|page=to edit and cleanup pages listed at Wikipedia:Contributor copyright investigations.}}

This is my simple guide to editing at CCI (Contributor Copyright Investigations). It covers how to mark listings and what to do in special situations, based on my own experience.

If you are experienced with this area on Wikipedia, feel free to add other advice. For a list I have made of CCIs, see User:Moneytrees/CCI Sort.

Basic steps of CCI

= Basic steps =

CCIs vary greatly in subject matter and in the way the subject copied content. Becoming familiar with the subject's way of editing (how they copied from sources, the citation style they used, and which sources they were fond of copying from) is useful when focusing on one CCI. [https://web.archive.org/ The Wayback Machine], a website that stores snapshots of web pages, including many that have since gone offline, is essential for work at CCI.
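As a rough illustration of how the Wayback Machine fits into this, the sketch below builds a link that asks for the snapshot closest to the date of an edit. It relies on the `https://web.archive.org/web/<timestamp>/<url>` address pattern, where the timestamp is written as `YYYYMMDDhhmmss`. The function name and the example source URL are made up for illustration.

```python
from datetime import datetime


def wayback_url(source_url: str, edit_time: datetime) -> str:
    """Return a Wayback Machine link for the snapshot nearest to edit_time.

    Hypothetical helper: the Wayback Machine redirects a
    /web/<YYYYMMDDhhmmss>/<url> address to the closest capture it holds.
    """
    timestamp = edit_time.strftime("%Y%m%d%H%M%S")
    return f"https://web.archive.org/web/{timestamp}/{source_url}"


# Example: a source cited in an edit made on 15 March 2009 (made-up URL).
print(wayback_url("http://example.com/history.html", datetime(2009, 3, 15)))
```

Opening the resulting link shows what the source said around the time the text was added, which matters when the live page has changed or gone dead.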

  • 1. Click on the diffs and check the cited sources. Simply scanning an article with Earwig isn't enough; you need to be thorough with your investigation.
  • 2. On an article that has a long history and many edits from many different users, you may not even need to run a check. Instead, take the diff link, paste it into the "URL comparison" box, and run the comparison against the article in the listing. The highlighted text shows whether the added content is still in the article. You can also use the Who Wrote That? extension to see if the content is still in the article.

:*2.1 If it is still in the article, then compare the source cited in the edit to the article using the above process. If there is no source, try looking through the next few edits in the page's history to determine when a ref was inserted-- it may have been removed over the years.

:*2.2 If it is no longer in the article, then mark the listing with ? Rewritten/removed since --~~~~ Make sure it wasn't moved to a different article. Some CCI subjects use sockpuppets to repeatedly edit the same article. Make sure what was removed wasn't rewritten by one of their socks.

  • 3. Enter the article title into Earwig and run the scan. Alternatively, compare the sources cited in the edit against the edit's revision ID, as long as the source is not dead. I strongly encourage looking at the sources cited in the initial edits, as they may no longer be in the article, and Earwig only does a limited web search, making it unlikely to find them.
  • 4. View the results. Ignore the percentage and go by the highlighted text. At least check everything above ten percent. Just because something doesn't register on Earwig doesn't mean it's not a copyright violation; close paraphrasing is not easily detected, and you'll sometimes have to manually compare the article and the source to find it.
  • 5. If you find no violation, write {{n}} Checked --~~~~
  • 6. If you do find (a) violation(s), remove or reword it. Make sure the article is still coherent afterwards. When debating between rewording or removing, consider how essential the content is to the article and how much would need to be reworded. Don't feel guilty for choosing remove over rewrite. Depending on how large the violation is, mark the article for a revdel; I highly recommend you install User:Enterprisey/cv-revdel for this. Replace the diffs next to the listing with {{y}} removed --~~~~
  • 7. Keep an eye out for sources in the public domain or under a free license; some of them will be attributed properly, some will not. They tend to be US government sources or very old (pre-{{#expr:{{CURRENTYEAR}}-95}}) material. Keep in mind that the public domain status of books in other countries can differ from their status in the United States; if you are unsure of the public domain status, see Commons:Commons:Copyright rules by territory. See the bottom of this page for a chart showing the compatible licenses.
  • 7.1. If it is unattributed, add {{tl|Source-attribution}} or {{tl|Creative Commons text attribution notice}} inside the ref, [https://en.wikipedia.org/w/index.php?title=Guyana_Sugar_Corporation&diff=prev&oldid=936890180 like I do here].
  • 8. For partially attributed or unattributed interwiki translations, note the article it was translated from on the talk page, [https://en.wikipedia.org/w/index.php?title=Talk:Diego_de_Holgu%C3%ADn&diff=936471095&oldid=767166259 like I do here].
  • 9. For unattributed in-wiki copying, add a note to the talk page, [https://en.wikipedia.org/w/index.php?title=Talk:Llangennith&diff=956000701&oldid=857676600 like I do here].
  • 10. For cut-and-paste moves that don't have parallel histories (edits in between the paste on both articles, making history merging impossible), tag the article with Template:History merge (can be found in Twinkle).
  • 11. For cases where you are unsure about who copied from what, the paste is very complicated, or it could be deleted but is not a straight G12, blank the article using {{subst:copyvio|url=INSERTURL}} and follow the instructions on the generated notice. Notifying CCI subjects that an article was blanked is not necessary.
  • 12. For book or other "offline/paywall" type violations, search Google Books or Google for sentences and unique phrases used in the edit to try to find a match. This is not always reliable, as several books have no preview and Google can be random in what it decides to show. You can also look for the source through The Wikipedia Library, ask someone for it through Wikipedia:WikiProject Resource Exchange/Resource Request, or look around for a copy on archive.org. Simply getting a copy through your institution, buying it, or borrowing it from a local library can also work. If none of these options are workable and the content is suspicious, it is best to remove it. If you need to verify whether actual copying happened, feel free to use more dubious methods; sometimes you need to break a rule to enforce another.
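
The manual comparison described in steps 2 and 4 can be approximated in a few lines. The sketch below is not how Earwig works; it is a simple assumed approach that lists the runs of consecutive words an added passage shares with a source (or with the current article text), which is roughly what the highlighted text shows. The function names, the sample sentences, and the five-word window are arbitrary choices.

```python
import re


def word_ngrams(text: str, n: int = 5) -> set:
    """Lower-case the text, drop punctuation, and return its word n-grams."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def shared_phrases(added_text: str, other_text: str, n: int = 5) -> list:
    """Return the n-word phrases that appear in both texts, sorted."""
    common = word_ngrams(added_text, n) & word_ngrams(other_text, n)
    return sorted(" ".join(gram) for gram in common)


# Made-up sample: text added in a diff, and a sentence from the cited source.
added = "The mill was founded in 1842 by two brothers from Leeds."
source = "Historians agree the mill was founded in 1842 by two brothers."
for phrase in shared_phrases(added, source):
    print(phrase)
```

Running the same function against the current article text shows whether the added wording is still live. An empty result does not rule out close paraphrasing, which still needs a side-by-side read.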

Detecting mirrors

Keep in mind that many sites have copied from Wikipedia over the years, and using the search engine with Earwig will almost always find a handful, so be careful when removing content. If it seems like the website copied from Wikipedia, press CTRL+F and search for "Wikipedia", which will often highlight text along the lines of "Taken from Wikipedia" on the scanned web page. Always be wary of user-generated websites; for example, every Wikipedia article has been copied by at least one BlogSpot site. Be careful when assessing IMDb violations; they've repeatedly copied Wikipedia plot summaries, and we've repeatedly copied them.
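
The CTRL+F check can be sketched as a small function. The marker phrases below are guesses at common attribution wording, not an established list, and the function name is invented. A hit only suggests a mirror, and a miss proves nothing.

```python
MIRROR_MARKERS = (
    # Assumed examples of attribution wording; extend as needed.
    "from wikipedia",
    "wikipedia, the free encyclopedia",
    "en.wikipedia.org",
)


def mirror_hints(page_text: str) -> list:
    """Return the marker phrases found in page_text (case-insensitive)."""
    lowered = page_text.lower()
    return [marker for marker in MIRROR_MARKERS if marker in lowered]


# Made-up page text standing in for the scanned web page.
page = "Llangennith is a village in Gower. Taken from Wikipedia."
print(mirror_hints(page))
```

Even when a marker is found, it is worth comparing the page's earliest archived date with the date of the edit before deciding which side copied from which.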

=Earwig times out when loading up this one site=

Certain sites don't like Earwig and will time out when it tries to scan them; The Independent and some PDFs are examples. If this happens, use a website that finds Google web caches, which are saved versions of pages that Earwig should always be able to read. https://cachedpage.co/ is an example; some Archive.org saves can be viable workarounds as well.

Presumptive removals

{{See also|User:The4lines/Presumptive removals}}

In some cases, sources copied from by the CCI subjects are inaccessible, of questionable veracity, or would cost significant money to access. There are also cases where infringement is guaranteed and obvious in most significant edits. In these cases, it is best to remove the content inserted. Note that this is a last resort; first try to find a way to access the content. Presumptive removals may also be warranted where the subject copied a specific kind of thing (e.g. plot summaries), where figuring out what the subject copied from would be too difficult, or where the CCI could be wrapped up quicker by just removing everything. If the sources are inaccessible and stubbing or removing the problematic content would not be feasible, tag the article for presumptive deletion. For presumptive removals and deletions, use the following:

Presumptive removal over copyright concerns, please see: Wikipedia:Contributor copyright investigations/INSERTNAME

{{subst:copyvio|url=Presumptive deletion over copyright concerns, please see: Wikipedia:Contributor copyright investigations/INSERTNAME}}

If the amount of text you remove is major (+500 or important text), please leave a note on the article's talk page with {{subst:CCI|INSERTNAME}}

Please mark the associated listing with something along the lines of {{x}} Presumptive removal ~~~~ or {{x}} Tagged for presumptive deletion ~~~~
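
If you handle many listings from the same CCI, the wording above can be filled in once and reused. The sketch below is only a convenience under that assumption: the function name and the "ExampleUser" case name are made up, and the strings are the ones given in this section with INSERTNAME replaced.

```python
CCI_PREFIX = "Wikipedia:Contributor copyright investigations/"


def presumptive_texts(cci_name: str) -> dict:
    """Fill the CCI name into the wording given in this section."""
    page = CCI_PREFIX + cci_name
    return {
        "removal_summary": (
            "Presumptive removal over copyright concerns, please see: " + page
        ),
        "deletion_tag": (
            "{{subst:copyvio|url=Presumptive deletion over copyright "
            "concerns, please see: " + page + "}}"
        ),
        "talk_note": "{{subst:CCI|" + cci_name + "}}",
    }


# "ExampleUser" is a placeholder, not a real CCI.
for label, text in presumptive_texts("ExampleUser").items():
    print(label, "->", text)
```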

License guide

{| CLASS="wikitable" ALIGN=CENTER STYLE="width:70%; text-align:center; margin-left:6em"
|-
! COLSPAN=2 STYLE="background-color:blue; color:white;" | License Compatibility with Wikipedia<br />For text only; please see Wikipedia:File_copyright_tags for licences allowed with files
|-
! COLSPAN=1 | Licenses compatible with Wikipedia
! COLSPAN=1 | Licenses not compatible with Wikipedia
|-
! COLSPAN=2 STYLE="font-weight:normal; font-style:italic" | Creative Commons licenses
|- style="vertical-align: top;"
|
* CC BY, all versions and ports, up to and including 4.0
* CC BY-SA 2.0, 2.5, 3.0, 4.0
* [https://creativecommons.org/share-your-work/public-domain/cc0/ CC0]
|
* CC BY-NC
* CC BY-NC-ND
* CC BY-ND
* CC BY-NC-SA
* CC BY-SA 1.0
|-
! COLSPAN=2 STYLE="font-weight:normal; font-style:italic" | Other licenses
|-
|
* GFDL and CC BY or CC BY-SA
|
* any GNU-only license (including GFDL)
|}
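
The chart can also be read as a lookup. The sketch below encodes only the entries listed in it; the function name and the expected input format (names such as "CC BY-SA 3.0") are assumptions, and dual-licensed sources (GFDL together with CC BY or CC BY-SA) are not modelled. Anything the chart does not cover returns None rather than a guess.

```python
def text_license_ok(name: str):
    """Return True/False per the chart, or None if the chart does not cover it.

    Expects names such as "CC BY-SA 3.0", "CC BY-NC", "CC0" or "GFDL".
    """
    parts = name.upper().split()
    if parts == ["CC0"]:
        return True
    if parts == ["GFDL"]:
        return False  # GNU-only; dual licensing is not modelled here
    if len(parts) < 2 or parts[0] != "CC":
        return None
    terms = parts[1].split("-")  # e.g. ["BY", "NC", "SA"]
    version = parts[2] if len(parts) > 2 else None
    if "NC" in terms or "ND" in terms:
        return False
    if terms == ["BY"]:
        # "All versions and ports, up to and including 4.0"
        if version in {None, "1.0", "2.0", "2.5", "3.0", "4.0"}:
            return True
        return None
    if terms == ["BY", "SA"]:
        if version == "1.0":
            return False
        return True if version in {"2.0", "2.5", "3.0", "4.0"} else None
    return None


for license_name in ("CC BY-SA 3.0", "CC BY-SA 1.0", "CC BY-NC-SA 4.0", "GFDL"):
    print(license_name, "->", text_license_ok(license_name))
```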