User talk:Citation bot/Archive 6
Bot does not handle aliases
The bot also adds "pages=" when there is already a "p=" or "pp=" or "page=" card. AManWithNoPlan (talk) 21:00, 26 December 2015 (UTC)
The solution is to edit the function add_if_new() in objects.php, adding the needed alias checks, such as changing if (( $this->blank("pages") && $this->blank("page"))
into if (( $this->blank("pages") && $this->blank("page") && $this->blank("pp") && $this->blank("p"))
Also will need to add some, like this: case 'issue':
if ($this->blank("issue") && $this->blank("number")) {
return $this->add($param, $value);
}
return false;
since they are caught in the catch all: default:
if ($this->blank($param)) {
return $this->add($param, sanitize_string($value));
}
}
AManWithNoPlan (talk) 15:06, 9 August 2016 (UTC)
See this diff, which results in a slew of citation errors for having both pages and pp, and note that in many of those entries, it munges the page range into an (inaccurate) single page. Squeamish Ossifrage (talk) 13:18, 18 October 2016 (UTC)
Bot is running but bug is not fixed. https://en.wikipedia.org/w/index.php?title=S-50_%28Manhattan_Project%29&type=revision&diff=773923091&oldid=773516462 Bot must be shut down until bugs can be fixed. Hawkeye7 (talk) 06:53, 5 April 2017 (UTC)
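A minimal sketch of the alias check proposed above, written as a standalone function rather than as the bot's actual add_if_new() code. The alias groups, the function names, and the plain-array representation of a citation are all assumptions made for illustration.

```php
<?php
// Hypothetical alias groups; the real bot may organise these differently.
const PARAMETER_ALIAS_GROUPS = [
    ['pages', 'page', 'pp', 'p'],
    ['issue', 'number'],
];

// Return the group of names that are aliases of $param (or just $param itself).
function alias_group(string $param): array
{
    foreach (PARAMETER_ALIAS_GROUPS as $group) {
        if (in_array($param, $group, true)) {
            return $group;
        }
    }
    return [$param];
}

// True only if $param and every known alias of it are absent or empty.
// $citation is a plain array of parameter name => value.
function all_aliases_blank(array $citation, string $param): bool
{
    foreach (alias_group($param) as $name) {
        if (isset($citation[$name]) && trim($citation[$name]) !== '') {
            return false;
        }
    }
    return true;
}

$citation = ['title' => 'Example', 'pp' => '12-15'];
var_dump(all_aliases_blank($citation, 'pages')); // pp is set, so pages should not be added
var_dump(all_aliases_blank($citation, 'issue')); // neither issue nor number is set
```

Keeping the groups in one table would mean a new alias only has to be added in one place instead of in each case branch.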
{{resolved}} in development branch. Soon to be live. AManWithNoPlan (talk) 02:17, 7 September 2017 (UTC)
Edits citations inside of nowiki tags
The solution is to deal with this at the same time that the code escapes out comments AManWithNoPlan (talk) 04:42, 6 August 2016 (UTC)
In objects.php add these lines right after equivalent comment lines:
$comments = $this->extract_object(Comment);
$nowiki = $this->extract_object(Nowiki);
$this->replace_object($comments);
$this->replace_object($nowiki);
class Comment extends Item {
const placeholder_text = '# # # Citation bot : comment placeholder %s # # #';
const regexp = '~<!--.*-->~us'; // Note from AManWithNoPlan: this regex is wrong---it is greedy: see other bot bugs on this talk page
const treat_identical_separately = FALSE;
public function parse_text($text) {
$this->rawtext = $text;
}
public function parsed_text() {
return $this->rawtext;
}
}
class Nowiki extends Item {
const placeholder_text = '# # # Citation bot : no wiki placeholder %s # # #'; // Have space in nowiki so that it does not through some crazy bug match itself recursively
const regexp = '~<nowiki>.*?</nowiki>~us';
const treat_identical_separately = FALSE;
public function parse_text($text) {
$this->rawtext = $text;
}
public function parsed_text() {
return $this->rawtext;
}
}
AManWithNoPlan (talk) 16:08, 9 August 2016 (UTC)
{{resolved}} in development branch. Soon to be live. AManWithNoPlan (talk) 02:17, 7 September 2017 (UTC)
Duplicating jstor
The bad code is in these lines of get_identifiers_from_url() in objects.php:
$this->rename("url", "jstor", $match[1]);
$this->rename("url", "bibcode", urldecode($bibcode[1]));
$this->rename("url", "pmc", $match[1] . $match[2]);
$this->rename('url', 'asin', $match['id']);
They should match the doi code, which is a forget followed by a set: $this->forget('url');
$this->set("doi", urldecode($match[1]));
I can't explain why one works and the other does not, but that is what happens. AManWithNoPlan (talk) 03:13, 9 August 2016 (UTC)
Google books data is sometimes rubbish
Also: https://en.wikipedia.org/w/index.php?title=Homing_pigeon&diff=prev&oldid=682284024
foreach ($xml->dc___creator as $author) {
$this->add_if_new("author" . ++$i, formatAuthor(str_replace("___", ":", $author)));
}
to: foreach ($xml->dc___creator as $author) {
if( $author != "Hearst Magazines" ) { // Catch common google bad authors
$this->add_if_new("author" . ++$i, formatAuthor(str_replace("___", ":", $author)));
}
}
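The single hard-coded name above could be generalised along the lines suggested elsewhere on this page (keywords such as 'magazine', 'journal', 'newspaper'). The following is only a sketch of that idea; the function name and keyword list are hypothetical, and a keyword match will inevitably produce some false positives.

```php
<?php
// Illustrative keyword list, taken from the suggestion on this talk page.
const NON_PERSON_KEYWORDS = ['magazine', 'journal', 'newspaper'];

// Guess whether an "author" string names an organisation rather than a person.
function looks_like_organisation(string $author): bool
{
    foreach (NON_PERSON_KEYWORDS as $keyword) {
        // stripos() is case-insensitive; a substring match also catches plurals.
        if (stripos($author, $keyword) !== false) {
            return true;
        }
    }
    return false;
}

var_dump(looks_like_organisation('Hearst Magazines'));
var_dump(looks_like_organisation('Kahneman, D'));
```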
{{resolved}} in development branch. Soon to be live. AManWithNoPlan (talk) 02:17, 7 September 2017 (UTC)
Erroneously reports DOI as broken
I thought this was fixed and marked it as such. Currently, the doi is flagged as invalid if CrossRef fails, which is reasonable, but the bot also needs to check whether dx.doi.org failed as well. AManWithNoPlan (talk) 00:42, 18 November 2015 (UTC)
$this->add_if_new('doi_brokendate', date('Y-m-d'));
to: $url_test = "http://dx.doi.org/".$doi ;
$headers_test = get_headers($url_test, 1);
if(empty($headers_test['Location']))
$this->add_if_new('doi_brokendate', date('Y-m-d'));
and change this code: $this->set("doi_brokendate", date("Y-m-d"));
to: $url_test = "http://dx.doi.org/".$doi ;
$headers_test = get_headers($url_test, 1);
if(empty($headers_test['Location']))
$this->set("doi_brokendate", date("Y-m-d"));
AManWithNoPlan (talk) 16:28, 9 August 2016 (UTC) Is this the same bug? Another editor reverted before I could act, but I checked and the doi is not broken at all. Hawkeye7 (talk) 22:25, 4 October 2016 (UTC)
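The header test in the proposed patch can be isolated into a small function so that it also copes with get_headers() returning false (which it does when the request fails). This is a sketch only; resolver_redirects() is a hypothetical helper, not part of the bot.

```php
<?php
// Decide from the result of get_headers($url, 1) whether the resolver
// answered with a redirect. $headers is an array, or false on failure.
function resolver_redirects($headers): bool
{
    if (!is_array($headers)) {
        return false;
    }
    // Normalise header names, since servers may send "location" or "Location".
    $headers = array_change_key_case($headers, CASE_LOWER);
    return !empty($headers['location']);
}

// Usage with a live lookup (needs network access, so left commented out):
// $headers = @get_headers('http://dx.doi.org/' . $doi, 1);
// if (!resolver_redirects($headers)) { /* only now flag the doi as broken */ }
```

With this shape, a network failure is treated the same as a missing redirect, so whether that should really mark a doi as broken is a separate decision.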
{{resolved}} in development branch. Soon to be live. AManWithNoPlan (talk) 02:18, 7 September 2017 (UTC)
Bot created arXiv= parameter error
The bot removed the class portion of the arXiv parameter value in
The bot should not modify valid
Here is an example of one that gets broken. {{cite arXiv|eprint=astro-ph/0409583 | title = Exploring the Divisions and Overlap between AGB and Super-AGB Stars and Supernovae | last1 = Eldridge | first1 = J. J. | last2 = Tout | first2 = C. A.|class=astro-ph|date=2004 }} AManWithNoPlan (talk) 15:49, 9 August 2016 (UTC) Here is the offending source code from objects.php: $eprint = str_ireplace("arXiv:", "", $this->get('eprint') . $this->get('arxiv'));
if ($class && substr($eprint, 0, strlen($class) + 1) == $class . '/')
$eprint = substr($eprint, strlen($class) + 1);
$this->set($arxiv_param, $eprint);
that should be: $eprint = str_ireplace("arXiv:", "", $this->get('eprint') . $this->get('arxiv'));
//if ($class && substr($eprint, 0, strlen($class) + 1) == $class . '/')
// $eprint = substr($eprint, strlen($class) + 1);
$this->set($arxiv_param, $eprint);
AManWithNoPlan (talk) 15:56, 9 August 2016 (UTC)
This only occurs if class is set AManWithNoPlan (talk) 00:26, 14 October 2016 (UTC)
{{resolved}} in development branch. Soon to be live. AManWithNoPlan (talk) 03:19, 7 September 2017 (UTC)
Link at top of results page leads to error
This code in objects.php : quiet_echo ("\n<hr>[" . date("H:i:s") . "] Processing page '<a href='http://en.wikipedia.org/wiki/" . addslashes($this->title) . "' style='text-weight:bold;'>{$this->title}</a>' — <a href='http://en.wikipedia.org/?title=". addslashes(urlencode($this->title))."&action=edit' style='text-weight:bold;'>edit</a>—<a href='http://en.wikipedia.org/?title=" . addslashes(urlencode($this->title)) . "&action=history' style='text-weight:bold;'>history</a> <script type='text/javascript'>document.title=\"Citation bot: '" . str_replace("+", " ", urlencode($this->title)) ."'\";</script>");
needs changed to quiet_echo ("\n<hr>[" . date("H:i:s") . "] Processing page '<a href='http://en.wikipedia.org/?title=" . addslashes($this->title) . "' style='text-weight:bold;'>{$this->title}</a>' — <a href='http://en.wikipedia.org/?title=". addslashes(urlencode($this->title))."&action=edit' style='text-weight:bold;'>edit</a>—<a href='http://en.wikipedia.org/?title=" . addslashes(urlencode($this->title)) . "&action=history' style='text-weight:bold;'>history</a> <script type='text/javascript'>document.title=\"Citation bot: '" . str_replace("+", " ", urlencode($this->title)) ."'\";</script>");
AManWithNoPlan (talk) 21:10, 6 August 2016 (UTC)
{{resolved}} in development branch. Soon to be live. AManWithNoPlan (talk) 02:18, 7 September 2017 (UTC)
Error converting url to arxiv parameter
Just need to strip the .pdf off of url when converting url to eprint. Super easy code change. AManWithNoPlan (talk) 19:19, 9 January 2016 (UTC) Change in objects.php $this->add_if_new("arxiv", $match[1]);
if (strpos($this->name, 'web')) $this->name = 'Cite arxiv';
to $match[1] = str_replace ( ".pdf" , "" , $match[1] );
$this->add_if_new("arxiv", $match[1]);
if (strpos($this->name, 'web')) $this->name = 'Cite arxiv';
and change this: return "{{Cite arxiv | eprint={$match[1]} }}";
to: $match[1] = str_replace ( ".pdf" , "" , $match[1] );
return "{{Cite arxiv | eprint={$match[1]} }}";
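As a variation on the str_replace() call above, the suffix can be removed only when it is at the very end of the identifier, so that a ".pdf" appearing elsewhere in the string is left alone. This is a sketch; strip_pdf_suffix() is a hypothetical helper name.

```php
<?php
// Remove a trailing ".pdf" (any letter case) from an arXiv identifier.
function strip_pdf_suffix(string $eprint): string
{
    // preg_replace() returns null only on a regex error; fall back to the input.
    return preg_replace('~\.pdf$~i', '', $eprint) ?? $eprint;
}

echo strip_pdf_suffix('astro-ph/0409583.pdf'), "\n";
echo strip_pdf_suffix('astro-ph/0409583'), "\n";
```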
JSTOR plant link mistaken for journal
That's annoying that JSTOR has chosen to add a new type of stable link (although it does start with plant) AManWithNoPlan (talk) 19:21, 7 February 2016 (UTC)
The fix needs to be put in objects.php as the third through fifth lines below:
if (strpos($url, "sici")) {
#Skip. We can't do anything more with the SICI, unfortunately.
} elseif (strpos($url, "plants")) {
#Skip. We can't do anything more with the plants, unfortunately.
} else
AManWithNoPlan (talk) 21:00, 6 August 2016 (UTC)
{{resolved}} in development branch. Soon to be live. AManWithNoPlan (talk) 02:18, 7 September 2017 (UTC)
When bibcodes ends with a dot, it leaves the dot out
I think the solution is to modify objects.php to add a special case for bibcodes, to sit above the catch all code: default:
if ($this->blank($param)) {
return $this->add($param, sanitize_string($value));
}
such as: case 'bibcode':
if ($this->blank($param)) {
$bibcode_pad = 19 - strlen($value); // Bibcodes are always 19 characters long
if($bibcode_pad > 0 ) { // Paranoid, don't want a negative value, if bibcodes get longer
$value = $value . str_repeat( ".", $bibcode_pad); // Add back on trailing periods
}
return $this->add($param, $value);
}
return false;
AManWithNoPlan (talk) 21:34, 6 August 2016 (UTC)
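The same padding can be sketched with str_pad(), which leaves the string untouched if it is already 19 characters or longer, so no explicit guard against a negative count is needed. pad_bibcode() is a hypothetical helper and the shortened bibcode in the usage line is made up for illustration.

```php
<?php
const BIBCODE_LENGTH = 19;

// Restore trailing periods that were stripped from a bibcode.
function pad_bibcode(string $value): string
{
    return str_pad($value, BIBCODE_LENGTH, '.', STR_PAD_RIGHT);
}

echo pad_bibcode('1974AJ.....79..81'), "\n"; // made-up 17-character input
echo strlen(pad_bibcode('1974AJ.....79..81')), "\n";
```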
{{resolved}} in development branch. Soon to be live. AManWithNoPlan (talk) 02:18, 7 September 2017 (UTC)
Comments cause trouble
As far as I can tell, there were no duplicated parameters when the bot did its edit. – Jonesey95 (talk) 02:54, 9 November 2014 (UTC)
Adding bogus DUPLICATE_ added: https://en.wikipedia.org/w/index.php?title=509th_Composite_Group&diff=636859536&oldid=636220208 DUPLICATE_ added: https://en.wikipedia.org/w/index.php?title=Shapley%E2%80%93Folkman_lemma&diff=655089982&oldid=651991293
{{cite book|publisher=Europa<!-- -->}}{{cite news<!-- -->|publisher=The}}
Here are a variety of lines from the bot source code (I might have missed one):
const regexp = '~<!--.*-->~us';
$comment_regexp = "~(<!--.*?)\|(.*?-->)~";
while(preg_match("~<!--.*?-->~", $c, $match)) {
if (preg_match_all("~<!--[\s\S]*?-->~", $page_code, $match)) {
I think the problem is the first one. It is greedy. The .* needs to be .*? like number three. AManWithNoPlan (talk) 20:41, 7 August 2016 (UTC)
https://en.wikipedia.org/w/index.php?title=2010_New_York_Yankees_season&type=revision&diff=797318586&oldid=796799252 Plastikspork
https://en.wikipedia.org/w/index.php?title=Alpha_particle&diff=795641460&oldid=795641155 Headbomb
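The difference between the greedy and the lazy pattern can be checked directly on the two-template example from this thread. This is a standalone demonstration, not the bot's own code; only the two regular expressions are taken from the source lines quoted above.

```php
<?php
$wikitext = '{{cite book|publisher=Europa<!-- -->}}{{cite news<!-- -->|publisher=The}}';

// Greedy: .* runs on to the LAST "-->" in the text.
$greedy = preg_match_all('~<!--.*-->~us', $wikitext, $greedy_matches);
// Lazy: .*? stops at the FIRST "-->" after each "<!--".
$lazy = preg_match_all('~<!--.*?-->~us', $wikitext, $lazy_matches);

echo "greedy: $greedy match(es)\n";
print_r($greedy_matches[0]);
echo "lazy: $lazy match(es)\n";
print_r($lazy_matches[0]);
```

The greedy pattern finds a single "comment" that swallows the end of the first template and the start of the second, whereas the lazy pattern finds the two real comments separately.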
Google data is not always right, and the bot is not telepathic
The date is grabbed from Google and not massaged at all. AManWithNoPlan (talk) 00:40, 23 January 2017 (UTC)
Bot generated invalid cite data "# # # comment"
This is because the search and replace is case sensitive, which is fine and dandy 99.9% of the time. Obviously, 0.1% of the time it fails. AManWithNoPlan (talk) 15:16, 5 April 2017 (UTC)
{{resolved}} in development branch. Live soon. AManWithNoPlan (talk) 02:33, 7 September 2017 (UTC)
Incorrect DOI removal
This is the comments bug. The bot uses a greedy search for comments. AManWithNoPlan (talk) 13:13, 22 July 2017 (UTC)
{{resolved}} in development branch. Soon to be live. AManWithNoPlan (talk) 02:18, 7 September 2017 (UTC)
Authors must be people, not companies
Perhaps the bot could look for keywords like 'magazine', 'journal', 'newspaper', etc and common variations (eg upper/lowercase, plurals). Stepho talk 09:29, 16 August 2017 (UTC)
{{resolved}} in development branch for a few select authors. Soon to be live. AManWithNoPlan (talk) 02:18, 7 September 2017 (UTC)
issue vs. volume confusion for journals with no volumes
http://search.crossref.org/?q=10.3897/zookeys.445.7778 The cross-ref data is wrong. So, it is not a bot bug, but the bot could easily fix it. AManWithNoPlan (talk) 19:15, 2 October 2015 (UTC)
The solution is to add code to objects.php in the public function add_if_new($param, $value) AManWithNoPlan (talk) 02:10, 7 August 2016 (UTC) case 'volume':
if ($this->blank($param)) {
if ( $this->get('journal') == "ZooKeys" ) return $this->add_if_new('issue',$value) ; // This journal has no volume
return $this->add($param, $value);
}
return false;
And change this code: if ($this->blank("journal") && $this->blank("periodical") && $this->blank("work")) {
return $this->add($param, sanitize_string($value));
}
to if ($this->blank("journal") && $this->blank("periodical") && $this->blank("work")) {
if ( sanitize_string($value) == "ZooKeys" ) $this->blank("volume") ; // No volumes, just issues.
return $this->add($param, sanitize_string($value));
}
Might be best long term to have a global array of such journals rather than having to keep adding them one by one.
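A sketch of what such an array might look like, with a helper that decides which parameter an incoming number belongs in. The constant and function names are hypothetical, and ZooKeys is the only entry because it is the only journal named in this thread.

```php
<?php
// Journals that number their output by issue only (list is illustrative).
const JOURNALS_WITHOUT_VOLUMES = ['ZooKeys'];

// Return the parameter that should receive a "volume" value for this journal.
function numbering_param_for(string $journal): string
{
    return in_array($journal, JOURNALS_WITHOUT_VOLUMES, true) ? 'issue' : 'volume';
}

echo numbering_param_for('ZooKeys'), "\n";
echo numbering_param_for('Nature'), "\n";
```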
Extended content
Data on NCBI seems to be ok: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC83919/ where the Journal is written as "Mol Cell Biol." on the webpage and as "MOLECULAR AND CELLULAR BIOLOGY" in the full text pdf. What to do in those cases? Include "Molecular and Cellular Biology" in: https://en.wikipedia.org/wiki/User:Citation_bot/capitalisation_exclusions in such cases? The same with -"The Journal of biological chemistry" e.g. PMID 9858585 -"The Journal of cell biology" e.g. PMID 9763423 and other cases seen in https://en.wikipedia.org/wiki/Special:RecentChangesLinked/Category:Cite_doi_templates ? Thanks--Saimondo (talk) 16:21, 3 August 2014 (UTC)
You are of course right, it´s no error it´s the catalog style NCBI is using. I don´t have the complete overview what capitalization format is obtained by the doi or issn vs pmid queries. But if you use the cite-> templates-> cite journal option here in the edit window and use autofill with the doi:10.1128/MCB.00698-14 you get "Molecular and Cellular Biology" if you use the same publications PMID 25022755 with autofill you get "Molecular and cellular biology". If capitalization means also harmonization I think few wikipedians would be against it. Furthermore, as far as I understand https://en.wikipedia.org/wiki/Wikipedia:Manual_of_Style#Titles_of_works the capitalization format like above should be ok (I have the impression that most journals use capitalization for their own names on their homepages/pdfs). Should we ask on the Manual of style talk page to see if there´s a consensus for capitalization? In case someone is interested, here is a recent reply of an email I (re-)sent to NCBI some time ago: "...Standard cataloging requires that the first word in the full journal title begins with an upper case letter and remaining words (except for proper nouns) begin with lower case. Journal title abbreviations begin with all upper-case letters. I checked the XML data for several journals and found that each of the title listed in this manner. You can see several examples at the bottom of this document: Fact Sheet: Construction of the National Library of Medicine Title Abbreviations http://www.nlm.nih.gov/pubs/factsheets/constructitle.html Sincerely, Ellen M. L. ... -Original Message- Dear NCBI Team, in the xml data of a specific article https://www.ncbi.nlm.nih.gov/pubmed/9858585?dopt=Abstract&report=xml&format=text the journal name is written "Molecular and cellular biology" and the abbreviation is "Mol Cell Biol.". I think the correct journal name should be "Molecular and Cellular Biology" as written on the journal homepage http://mcb.asm.org/content/19/1/612.long ." 
Saimondo (talk) 17:29, 10 September 2014 (UTC)
AManWithNoPlan (talk) 20:43, 2 January 2016 (UTC)
changing case "periodical": case "journal":
if ($this->blank("journal") && $this->blank("periodical") && $this->blank("work")) {
return $this->add($param, sanitize_string($value));
}
return false;
into case "periodical": case "journal":
if ($this->blank("journal") && $this->blank("periodical") && $this->blank("work")) {
return $this->add($param, format_title_text(sanitize_string($value)));
}
return false;
- New github pull submitted that applies title case in more locations. AManWithNoPlan (talk) 16:55, 7 September 2017 (UTC)
- Need to add option $title = mb_convert_case($title, MB_CASE_TITLE, "UTF-8") AManWithNoPlan (talk) 13:22, 10 September 2017 (UTC)
{{resolved}} in Dev AManWithNoPlan (talk) 18:25, 2 October 2017 (UTC)
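For reference, a sketch of how mb_convert_case() with MB_CASE_TITLE behaves on the journal names discussed above. On its own it capitalises every word, so this example lowercases a few small words afterwards; the small-word list and the function name are assumptions, and this is not the bot's format_title_text().

```php
<?php
// Words kept lowercase unless they start the title (illustrative list).
const SMALL_WORDS = ['and', 'of', 'the', 'in', 'for'];

function journal_title_case(string $title): string
{
    // MB_CASE_TITLE upper-cases the first letter of every word.
    $words = explode(' ', mb_convert_case($title, MB_CASE_TITLE, 'UTF-8'));
    foreach ($words as $i => $word) {
        $lower = mb_strtolower($word, 'UTF-8');
        if ($i > 0 && in_array($lower, SMALL_WORDS, true)) {
            $words[$i] = $lower;
        }
    }
    return implode(' ', $words);
}

echo journal_title_case('Molecular and cellular biology'), "\n";
echo journal_title_case('The Journal of biological chemistry'), "\n";
```

One caveat: MB_CASE_TITLE also lowercases the remaining letters of each word, so acronyms in a journal name would need separate handling, which is presumably where the capitalisation exclusions page comes in.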
citing using pmid creates author1 instead of last1
- Status
- improvement
- Reported by
- Ihaveacatonmydesk (talk) 21:33, 30 May 2016 (UTC)
- Type of bug
- Inconvenience: Humans must occasionally make immediate edits to clean up after the bot
- What happens
- {{cite journal|pmid=12858711 |year=2003 |author1=Lovallo |first1=D |title=Delusions of success. How optimism undermines executives' decisions |journal=Harvard business review |volume=81 |issue=7 |pages=56–63, 117 |last2=Kahneman |first2=D }}
- What should happen
- {{cite journal|pmid=12858711 |year=2003 |last1=Lovallo |first1=D |title=Delusions of success. How optimism undermines executives' decisions |journal=Harvard business review |volume=81 |issue=7 |pages=56–63, 117 |last2=Kahneman |first2=D }}
- Replication instructions
- use a pmid and click the button to autocomplete - also does the same thing when inputting a url into cite book, like {{cite book|url=https://books.google.com/?id=FI7l8O1tlkkC}}
- We can't proceed until
- Bot operator's feedback on what is feasible
|author1= is an alias of |last1=. This would be a cosmetic fix (in the code) only. – Jonesey95 (talk) 22:55, 31 May 2016 (UTC)
- Agreed, but since it's such a simple fix it would be a shame not to do it. Also I actively search for "author" when most of the refs are |lastn=/|firstn= to edit them for consistency, and that creates false positives. Ihaveacatonmydesk (talk) 08:28, 1 June 2016 (UTC)
- When splitting an author into last and first, it keeps the original type when setting the last name. Pull request done to switch to last. https://github.com/ms609/citation-bot/pull/169 AManWithNoPlan (talk) 15:14, 25 September 2017 (UTC)
{{resolved}} in Dev AManWithNoPlan (talk) 18:42, 2 October 2017 (UTC)
URL in the website field instead of the URL field (common newbie error)
- Status
- feature request
- Reported by
- Kerry (talk) 06:48, 1 March 2017 (UTC)
- Type of bug
- Potentially Deleterious: Invisible Human-input data is deleted
- What happens
- the bot is removing the accessdate from citations saying "Removed accessdate with no specified URL"
when the citation does contain a URL but it is in the website field (a common mistake made by newbies, especially those who don't understand the jargon "URL" -- in my experience of doing training in public libraries, many people call these "web addresses" and not "URL")
- What should happen
- Ideally. If a citation has a URL in the website field and the URL field is empty, move the URL into the correct field and empty the website field. If that's not possible for the bot to do, then don't delete the accessdate, but try and warn in some way. (In a super-ideal world, the editor software would not use the term URL but say "address of web page", but I assume this is out of scope here).
- Relevant diffs/links
- https://en.wikipedia.org/w/index.php?title=George_Christensen_%28politician%29&type=revision&diff=768005628&oldid=767963006
- Replication instructions
- undo it and run the bot again
- We can't proceed until
- Agreement on the best solution
The access date that is deleted is not actually shown to humans. Attempt to have bot do this: https://github.com/ms609/citation-bot/pull/172 AManWithNoPlan (talk) 21:37, 25 September 2017 (UTC)
{{resolved}} in Dev AManWithNoPlan (talk) 18:26, 2 October 2017 (UTC)
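A rough sketch of the behaviour requested in this report: when |url= is empty and |website= holds something that looks like a web address, move it across. This is not the code from the pull request; the function name, the plain-array citation, and the deliberately strict "looks like a URL" test are all assumptions.

```php
<?php
// Move a URL that was typed into |website= over to an empty |url=.
// $citation is a plain array of parameter name => value.
function move_website_to_url(array $citation): array
{
    $url = trim($citation['url'] ?? '');
    $website = trim($citation['website'] ?? '');
    // Only act on values that are a single http(s) address and nothing else.
    if ($url === '' && preg_match('~^https?://\S+$~i', $website)) {
        $citation['url'] = $website;
        unset($citation['website']);
    }
    return $citation;
}

print_r(move_website_to_url([
    'website'    => 'http://example.org/page',
    'accessdate' => '1 March 2017',
]));
```

Running such a step before the accessdate check would mean the access date is no longer orphaned in the case described.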
lowercasing "the" as the first word in a subtitle?
Is it correct for this bot to remove capitalization from the word "the" when it's immediately following a colon as the first word in a subtitle? That's a wordy sentence, and might be confusing, so I'll also ask: is it correct for this bot to do this: https://en.wikipedia.org/w/index.php?title=Initiations_%28Star_Trek%3A_Voyager%29&type=revision&diff=795130102&oldid=795049756 ? — fourthords | =Λ= | 15:11, 12 August 2017 (UTC)
- That is a good question. I edited https://en.wikipedia.org/wiki/User:Citation_bot/capitalisation_exclusions to make this Star Trek magazine have a capital The. Generally, a the is not capitalized in the middle of a sentence, but this is a weird case where a colon really is being used more like a period than a colon. AManWithNoPlan (talk) 13:44, 13 August 2017 (UTC)
- The convention I've usually seen is that the word following a colon in a complete English sentence is not capitalized (although I think in earlier styles it might have been) but the word following a colon in the title of a publication is capitalized. For instance the mathematics publication database MathSciNet, which aggressively lowercases even words after the first in titles of books (unlike most other bibliographic sources), nevertheless follows this convention. —David Eppstein (talk) 18:23, 13 August 2017 (UTC)
- There is a style out there which will capitalize the first word after a colon even in a full sentence, but that's a rare one; most of my experience has been the same as David's. --Izno (talk) 19:58, 25 August 2017 (UTC)
- This might be working in the development version on github. Not yet deployed to wiki land. AManWithNoPlan (talk) 21:37, 12 September 2017 (UTC)
{{resolved}} in Dev AManWithNoPlan (talk) 18:33, 2 October 2017 (UTC)
Creates invalid ISO date
- Status
- new bug
- Reported by
- Keith D (talk) 21:06, 18 August 2017 (UTC)
- Type of bug
- Deleterious: Human-input data is deleted or articles are otherwise significantly affected. Many bot edits require undoing.
- What happens
- Changes hyphens to em-dashes in ISO dates (I had just corrected the date earlier today)
- Relevant diffs/links
- https://en.wikipedia.org/w/index.php?title=Ye_Wenling&diff=796142445&oldid=796141424
- We can't proceed until
- Bot operator's feedback on what is feasible
- Requested action from maintainer
- Not change two dashed dates
- This looks like a GIGO error. "2007-08-01" should not be in |year=. A format like that should be in |date=. The bot could perhaps ignore this incorrect format, leaving it for a human editor to fix. In this case, the bot did human editors a favor by highlighting an erroneous parameter value. – Jonesey95 (talk) 22:12, 18 August 2017 (UTC)
I guess we agree on this AManWithNoPlan (talk) 19:24, 5 September 2017 (UTC)
- Have to disagree with this conclusion, something should be done in the code to stop this happening even when there is incorrect usage of fields. Keith D (talk) 21:05, 5 September 2017 (UTC)
- Garbage in; Garbage out. I will write a patch to detect more than one dash. That way the original garbage stays put AManWithNoPlan (talk) 00:46, 6 September 2017 (UTC)
- Need to add this: && (substr_count($text, '-') < 2 || substr_count($text, '--') != 0 ) (this means that if more than one dash is found, then do not change, unless there are dashes next to each other). Probably change |year= to |date=. AManWithNoPlan (talk) 16:20, 13 September 2017 (UTC)
{{resolved}} in Dev AManWithNoPlan (talk) 18:36, 2 October 2017 (UTC)
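The guard condition proposed in this thread can be tried out in isolation. The expression is the one quoted above; wrapping it in a function called may_convert_dashes() is an assumption made for the example.

```php
<?php
// True if it is safe to convert hyphens to dashes: at most one hyphen,
// or at least one pair of adjacent hyphens ("--").
function may_convert_dashes(string $text): bool
{
    return substr_count($text, '-') < 2 || substr_count($text, '--') != 0;
}

foreach (['2007-08-01', '56-63', '1--5', '2002'] as $value) {
    echo $value, ' => ', may_convert_dashes($value) ? 'convert' : 'leave alone', "\n";
}
```

An ISO date such as 2007-08-01 contains two separate hyphens and is therefore left alone, while an ordinary range such as 56-63 is still converted.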
Bot broke a URL
This edit altered a dash to an en dash in a URL within a page= parameter. You need to check that, if the page or pages parameter includes an open square bracket, nothing is changed before a space or a close square bracket. -- PBS (talk) 16:29, 27 August 2017 (UTC)
Extended content
GIGO "garbage in garbage out", do you mean "Rubbish in rubbish out"? It is not rubbish in to use a url link for a page number. It is not a misuse of the template, it is a misuse of the bot. Please fix the bot. I have only had a limited time to sample the bot's output. Here are some other problems:
This is something generated by the Google Books tool. While it is not a bug to change dash to ndash the correct thing to do if the parameter is These should probably not have been touched:
--PBS (talk) 22:17, 27 August 2017 (UTC)
"There is no way for the bot to deal with all the ways that templates can be used wrong" The template is not being used "wrong". Do you need help fixing the bot? -- PBS (talk) 22:20, 27 August 2017 (UTC)
BTW I am very concerned with the string of edits you made to Murder Act 1751 after I raised the problem with the bot. Please explain -- PBS (talk) 22:40, 27 August 2017 (UTC)
AManWithNoPlan, your claim that the
I will have a git pull submitted. AManWithNoPlan (talk) 20:38, 5 September 2017 (UTC)
{{resolved}} in Dev AManWithNoPlan (talk) 18:40, 2 October 2017 (UTC)
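One way to implement the check PBS describes (leave everything from an open square bracket up to the next space or close bracket untouched) is sketched below. It is not the code from the pull request; endash_outside_links() is a hypothetical name, and the example URL is made up.

```php
<?php
// Convert hyphens to en dashes in a page(s) value, but never inside the
// URL part of a bracketed external link such as "[http://... label]".
function endash_outside_links(string $pages): string
{
    // Split on "[" followed by everything up to the next space or "]".
    // With PREG_SPLIT_DELIM_CAPTURE the captured link parts sit at odd indexes.
    $parts = preg_split('~(\[[^\s\]]*)~', $pages, -1, PREG_SPLIT_DELIM_CAPTURE);
    foreach ($parts as $i => $part) {
        if ($i % 2 === 0) {
            $parts[$i] = str_replace('-', '–', $part);
        }
    }
    return implode('', $parts);
}

echo endash_outside_links('[http://example.org/a-b?id=1 12-15]'), "\n";
echo endash_outside_links('56-63'), "\n";
```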
Linefeeds
Arxiv often has linefeeds in titles and journal names. Need to strip them out and probably replace with a space. AManWithNoPlan (talk) 04:03, 10 September 2017 (UTC)
- Something like find '\s+' replace ' '
- I have added code to github to replace "\n\r","\r\n","\r","\n" each with a single space (all four are valid depending upon your OS). Once the dev version is updated, I will test it out. AManWithNoPlan (talk) 15:12, 12 September 2017 (UTC)
- @AManWithNoPlan: it should strip tabs too. Headbomb {t · c · p · b} 15:38, 12 September 2017 (UTC)
- @Headbomb: $v = preg_replace('/(\s\s+|\t|\n)/', ' ', $v); I think this grabs all of them and all spaces and cuts them down to one space. AManWithNoPlan (talk) 16:10, 12 September 2017 (UTC)
- Wouldn't \s+ cover all of that though? Headbomb {t · c · p · b} 16:43, 12 September 2017 (UTC)
- There you go being right. AManWithNoPlan (talk) 16:52, 12 September 2017 (UTC)
{{resolved}} in Dev AManWithNoPlan (talk) 18:39, 2 October 2017 (UTC)
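The simpler \s+ pattern agreed on in this thread can be sketched as follows. The function name is hypothetical, and the trim() is an addition not mentioned in the discussion; the sample title is adapted from the arXiv example elsewhere on this page.

```php
<?php
// Collapse every run of whitespace (spaces, tabs, CR, LF) to a single space.
function collapse_whitespace(string $v): string
{
    return trim(preg_replace('/\s+/', ' ', $v));
}

echo collapse_whitespace("Exploring the\r\n  Divisions\tand Overlap\n"), "\n";
```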
First parameter gets deleted
- Status
- new bug
- Reported by
- Martin (Smith609 – Talk) 12:49, 15 September 2017 (UTC)
- Type of bug
- Deleterious
- What happens
- Unnamed parameter code deletes first element of citation templates
- Relevant diffs/links
- Triggered using Citation button: https://en.wikipedia.org/w/index.php?title=Xiaoheiqingella&type=revision&diff=800750038&oldid=800749961
Triggered using wmftools link: https://en.wikipedia.org/w/index.php?title=Xiaoheiqingella&type=revision&diff=800750114&oldid=800750078
- Replication instructions
- See edits
- We can't proceed until
- Agreement on the best solution
{{Cite journal |STUFF | pp. 1–5}} leads to this: {{Cite journal|pages=1–5}}
protected function correct_param_spelling() or more likely use_unnamed_params() in Template.php is to blame.
- This {{Cite journal|pp. 1–5}} becomes {{Cite journa}} because the bot deletes the first entry. AManWithNoPlan (talk) 19:30, 19 September 2017 (UTC)
{{resolved}} in Dev AManWithNoPlan (talk) 18:39, 2 October 2017 (UTC)
bot adds |year= when |date= already holds valid date
- Status
- new bug
- Reported by
- Trappist the monk (talk) 10:32, 27 September 2017 (UTC)
- Type of bug
- Inconvenience: Humans must occasionally make immediate edits to clean up after the bot
- What happens
- bot added |year=2002 even though |date=2002 already present in the citation; this adds the page to Category:CS1 maint: Date and year
- What should happen
- nothing; the citation was fine without |year=2002
- Relevant diffs/links
- look for bonehead mistakes
- Replication instructions
- don't know, I wasn't the bot driver; history claims that the bot made this edit autonomously
- We can't proceed until
- Code to be fixed
The article title is funny (I thought you were being funny).. {{cite journal|date=2002|doi=10.1635/0097-3157(2002)152[0215:HPOVBM]2.0.CO;2}} is enough to get the bug. AManWithNoPlan (talk) 13:57, 27 September 2017 (UTC)
- It is the SICI data that was used. https://github.com/ms609/citation-bot/pull/176 AManWithNoPlan (talk) 01:53, 29 September 2017 (UTC)
{{resolved}} in Dev AManWithNoPlan (talk) 18:37, 2 October 2017 (UTC)
Update jstor links
Old links include SICI, they redirect to stable jstor https://www.jstor.org/sici?sici=0003-0279(196101%2F03)81%3A1%3C43%3AWLIMP%3E2.0.CO%3B2-9 Should figure that out and update. AManWithNoPlan (talk) 04:08, 30 September 2017 (UTC) https://github.com/ms609/citation-bot/pull/201 and test later with this:
public function testJstorSICI() {
$text = '{{Cite journal|url=https://www.jstor.org/sici?sici=0003-0279(196101%2F03)81%3A1%3C43%3AWLIMP%3E2.0.CO%3B2-9}}';
$expanded = $this->process_citation($text);
$this->assertEquals('594900',$expanded->get('jstor'));
}
AManWithNoPlan (talk) 04:19, 2 October 2017 (UTC)
{{resolved}} in GitHub. AManWithNoPlan (talk) 02:14, 3 October 2017 (UTC)