IRC log for #evergreen, 2017-10-24

All times shown according to the server's local time.

Time	Nick	Message
03:48		Jillianne joined #evergreen
06:00	pinesol_green	News from qatests: Test Success <http://testing.evergreen-ils.org/~live>
06:01		rlefaive joined #evergreen
07:19		rlefaive joined #evergreen
08:35		mmorgan joined #evergreen
08:52		kmlussier joined #evergreen
08:56		bos20k joined #evergreen
09:02		rlefaive_ joined #evergreen
09:04		jvwoolf joined #evergreen
09:15		Dyrcona joined #evergreen
09:16	Dyrcona	So, our '7 Day Card Expiration Courtesy Notice' hasn't worked since Feb. 23, 2017 when we had some trouble with the utility server that runs the cron job.
09:17	Dyrcona	I have checked and the necessary action_trigger_runner has been running.
09:17	Dyrcona	Is there anything else that anyone can think of that I should look at?
09:19	Dyrcona	I also just ran it manually and nothing happened. No event were added to action_trigger.event.
09:19	Dyrcona	Yes, the event_definition is set to active.
09:27	Dyrcona	Also, I seem to have a 2 second time out showing up in the back end somewhere.
09:28	Dyrcona	I've been looking into another issues since the upgrade of a report template folder not loading.
09:28	Dyrcona	If it can retrieve the list of templates in less than 2 seconds, they all display.
09:29	Dyrcona	If it takes more than 2 seconds, nothing displays and I get this in the logs: osrf_json_gw 2017-10-23 16:32:53 [INFO:30632:osrf_app_session.c:394:1508754310306324] Returning NULL from app_request_recv after timeout: open-ils.reporter.folder_data.retrieve [parameters redacted]
09:32	Dyrcona	I've also noticed that some times in the log show up in the local timezone and others are UTC.
09:32	Dyrcona	Even from the same server.
09:38	Dyrcona	Also, how com booking requests are being made on a server not running the booking service?
09:39	Dyrcona	s/com/come/
09:39	* Dyrcona	is full of questions today.
09:41	mmorgan	Dyrcona: Our card expiration courtesy notices have been running fine, here's our event_def:
09:42	pastebot	"mmorgan" at 64.57.241.14 pasted "Expiration courtesy notice" (22 lines) at http://paste.evergreen-ils.org/900
09:43	Dyrcona	I don't think our event def has changed since before it stopped working.
09:45	Dyrcona	I also think it must be related to a missing file, perhaps an at filter, because it stopped working around the time the server had a hardware failure and had to be temporarily replaced, and then was later put back into service.
09:45	Dyrcona	And, it was completely wiped, and everything reinstalled, after the failure.
09:45	Dyrcona	I don't have exact records of the dates, so I don't know if it worked on the temporary replacement or not.
09:46	Dyrcona	That was a very hectic time with multiple issues going on.
09:47	Dyrcona	And, this is great. osrferror log points to a SQL error, but that timestamp in the pg log shows no error. Guess the message didn't make it to syslog server. I'll have to check on the db server to make sure.
09:48	Dyrcona	Just a warning that there is no transaction in progress.
09:56	pastebot	"mmorgan" at 64.57.241.14 pasted "FWIW, our a_t_filter" (11 lines) at http://paste.evergreen-ils.org/901
10:01		mmorgan1 joined #evergreen
10:06		jvwoolf1 joined #evergreen
10:56		collum joined #evergreen
10:58	Dyrcona	mmorgan1: Thanks for the suggestions.
10:58	Dyrcona	I've added a filter, but I'm still not getting any output and the log messages that I see don't help.
10:59	Dyrcona	I also can't be certain which are actually from this and which aren't.
11:39		sandbergja joined #evergreen
11:53		_adb joined #evergreen
11:59		mmorgan joined #evergreen
12:12		khuckins_ joined #evergreen
13:13	dbs	I've fixed a number of the PDF & ePub build breaks over the years; might be a good idea to run automated builds and notify in channel when the docs are busted.
13:13	dbs	Maybe even as part of the regular automated build?
13:24	* Dyrcona	is getting nowhere on this a/t problem.
13:25	bshum	dbs: That sounds like a good enhancement idea. We need to update the installers anyways since they're built atop Wheezy, and we should move that to newer platform
13:25	bshum	Maybe a Hack-A-Way agenda item or task to poke at
13:25	bshum	I'll add it to the list of ideas
13:28	Dyrcona	Now that I have a filter in place, the a/t runner just runs, seemingly doing nothing.
13:28	Dyrcona	No log messages to osrfsys.log, nothing in the PG logs...
13:28	dbs	bshum++ good idea
13:28	Dyrcona	All connections to the db server are idle.
13:29	Dyrcona	Although this did pop up: 2017-10-24 13:25:40 EDT [24470-1] evergreenpostgres LOG: could not receive data from client: Connection timed out
13:30	Dyrcona	But, that's the wrong database, so never mind.
13:32	Bmagic	So, I changed the marc template file that is referenced in opensrf.xml-> open-ils.cat -> app_settings -> marctemplates. I am not seeing my changes in the staff client. I restarted osrf_control --service=open-ils.cat --restart --localhost - no dice
13:33	Bmagic	Do I have to restart all of th eservices?
13:37	* mmorgan	generally restarts all services for marc template changes. Not sure it's necessary, but it gets the job done.
13:37	Dyrcona	You have to restart opensrf.settings also, then restart open-ils.cat, and you have to quit the staff client and log back in, IIRC.
13:37	jeff	Bmagic: You'd need to restart opensrf.settings first, before restarting open-ils.cat--- yeah, what Dyrcona just said. :P
13:37	Dyrcona	You only need to restart opensrf.settings and open-ils.cat.
13:37	Bmagic	ah! thanks
13:38	jeff	services don't read the config file from disk, they request it from opensrf.settings.
13:44	Bmagic	I see
13:44	Bmagic	it didn't work but whatever, I will just wait until midnight and restart everything
13:45	Dyrcona	Bmagic: Did you quit the staff client and sign in again?
13:45	Bmagic	yep
13:45	Dyrcona	The sequence is:
13:45	Bmagic	The change I made was to the already-existing xml template file
13:45	Dyrcona	1. Put the new templates and opensrf.xml in place.
13:45	Dyrcona	2. Restart opensrf.settings
13:45	Dyrcona	3. Restart open-ils.cat
13:46	Bmagic	4. close staff client and sign-in again
13:46	Dyrcona	Well, you should close the staff client to begin with, but...
13:47	Bmagic	I did exactly that. For exactness.... I used these commands for steps 2 and 3
13:47	Dyrcona	And you did that on all of the bricks?
13:47	Bmagic	osrf_control --service=opensrf.settings --restart --localhost
13:47	Bmagic	osrf_control --service=open-ils.cat --restart --localhost
13:47	Dyrcona	I don't usually use --localhost, but thats a matter of configuration.
13:47	Bmagic	the feedback on the bash prompt mentioned pid ID's and positive feedback
13:47	Bmagic	perhaps it needs the --stop then followed by --start?
13:48	mmorgan	If the change is to an existing file, are service restarts required? Maybe just clear the client cache?
13:48	Dyrcona	No, restart should be enough.
13:48	Bmagic	I was thinking client cache was a possible issue
13:48	Dyrcona	Maybe.
13:49	Dyrcona	We don't usually change existing templates without also adding a new one or doing more things besides.
13:50	Bmagic	I would like to add that I did include two new templates in opensrf.xml while I was add it
13:50	Bmagic	add/at
13:50	mmorgan	Do the new templates show up?
13:50	Bmagic	they sure dont
13:51	Bmagic	It's no big deal
13:51	Bmagic	I assume restarting everything will do the trick
13:58	Dyrcona	Bmagic: Is there an error in opensrf.xml?
14:03		jvwoolf1 left #evergreen
14:09		jvwoolf joined #evergreen
14:10	Dyrcona	[2017-10-24 18:03:48] open-ils.trigger [ERR :1679:EX.pm:66:] Exception: OpenSRF::EX::JabberDisconnected 2017-10-24T18:03:48 OpenSRF::Transport::SlimJabber::Client /usr/local/share/perl/5.22.1/OpenSRF/Transport/SlimJabber/Client.pm:199 JabberDisconnected Exception: This JabberClient instance is no longer connected to the server
14:11	Dyrcona	Note the timestamp... Also, why?
14:14	Dyrcona	And, the a/t runner is still running, apparently doing nothing.
14:14	Dyrcona	A bunch of events were created according to the logs, but apparently never committed to the database.
14:14		jvwoolf1 joined #evergreen
14:16	pastebot	"Dyrcona" at 64.57.241.14 pasted "open-ils.trigger disconnection" (15 lines) at http://paste.evergreen-ils.org/902
14:17	Dyrcona	No apparent reason for the disconnect.
14:34	pastebot	"Dyrcona" at 64.57.241.14 pasted "Another try with the 30-day expire notification" (45 lines) at http://paste.evergreen-ils.org/903
14:35	Dyrcona	I think I should Lp this, but I'm not sure it isn't just us.
14:35	Dyrcona	Looking into report folders not loading all templates yesterday, I see something that is suspiciously similar, but there it looks like there's a 2 second time out somewhere that I can't find.
14:38	berick	Dyrcona: ejabberd logs have nothing to offer?
14:39	berick	that's a lot of unceremonious disconnecting
14:45	pastebot	"Dyrcona" at 64.57.241.14 pasted "ejabberd log" (11 lines) at http://paste.evergreen-ils.org/904
14:54	berick	huh. just gone.
14:55	berick	and in the case of pid 1679, it's creating a tiny object (action_trigger.event)
14:58	Dyrcona	I see similar things in production, too.
15:03		mmorgan1 joined #evergreen
15:11	Dyrcona	Here we go with loglevel set to debug.
15:17	Dyrcona	Well, if the stupid VPN would cooperate.
15:18	Bmagic	Dyrcona: sorry, had to step away. I don't think there is an error in that xml file. I had wondered that myself, but I only added two lines by copying a line above and repeating it twice, then changing the values required.
15:35	pastebot	"Dyrcona" at 64.57.241.14 pasted "With loglevel degug doesn't help much." (23 lines) at http://paste.evergreen-ils.org/905
15:37	Dyrcona	It sends the message to cstore, then immediately disconnects claiming to have received no response.
15:37	berick	Dyrcona: does this always happen after a certain number of events? near the beginning? end?
15:37	Dyrcona	I wonder if the timestamps being different has anything to do with it?
15:38	Dyrcona	berick: I don't know for sure. I have about 2.8MB of log file from when I started the action trigger runner until this disconnect occurs if you want to see it.
15:39	berick	if there are no other ERR's, probalby not much to learn there. mostly curious if there's a predictable pattern
15:40	Dyrcona	Well, every time I run this particular a/t event, this happens. :)
15:40	berick	heh, a pattern emerges
15:42	Dyrcona	I have about 329 of these before the lines I pasted: [2017-10-24 19:25:12] open-ils.trigger [DEBG:2526:AppSession.pm:539:] AppSession sending RESULT to opensrfprivate.localhost/client_at_jasontest.cwmars.org_2581 with threadTrace [3]
15:44	Dyrcona	678 That match the 2526 pid and "Received"
15:45	Dyrcona	I suspect I'm hitting 1,000 messages and it's dying, like you suspected in conversations with miker.
15:49	berick	Dyrcona: and you're seeing numerous calls to open-ils.cstore.direct.action_trigger.event.create -- and no committed data to reflect these calls?
15:51		mmorgan joined #evergreen
15:51	berick	Dyrcona: 1,000 message, is that a reference to max_requests?
15:51	Dyrcona	Yes, 1,648 to be exact.
15:52	berick	some of those may be dupes, but that's still a lot of events to create w/o a commit.
15:52	Dyrcona	Yes, some are dupes, looks like up to 3 lines or so per actual create call.
15:54	berick	huh, i thought it committed after every event create, but at a glance it commits after they are all created
15:54	berick	so, i guess those numbers are not surprising
15:55	Dyrcona	Looks like 329 actual attempts to create the events.
15:58	berick	backing up a bit, this would not be affected by opensrf.xml max_requests. that applies to router-level requests (not inside a transaction / drone session). and it's graceful, exiting after communication ends.
15:59	* berick	wonders if ejabberd has a new "you talk to much, I'm kicking you off" setting
16:01	Dyrcona	It looks like voluntary, though, like the trigger drone decided to disconnect on its own.
16:02	berick	it wouldn't log that as an error
16:02	berick	it was kicked off
16:03	Dyrcona	And, there's nothing in the ejabberd.log to that effect.
16:03	berick	yeah, that's frustrating
16:05		khuckins__ joined #evergreen
16:05	Dyrcona	Nothing in ejabberd.yml, either.
16:06	Dyrcona	No setting other than max_offline_messages that looks remotely close.
16:08	Dyrcona	If it was sending more than 500,000 bytes/sec, would the shaper have kicked it out?
16:09	Dyrcona	Doesn't look like it...
16:12	berick	nah, just slowed it down
16:12	berick	and the offline stuff should be disabled, fwiw
16:12	berick	(though some settings may not have to be)
16:15	* miker	appears out of the mist
16:17	miker	so, this looks almost exactly like bug 1704396
16:17	pinesol_green	Launchpad bug 1704396 in Evergreen "Slowness for metarecord and one-hit searches in 2.12" [High,Confirmed] https://launchpad.net/bugs/1704396
16:21	miker	what happens there is that a search backend makes a request, usually to storage but sometimes to circ, and then while (or before) the response arrives, the search backend goes away
16:22	berick	miker: does it go away after being disconnected from jabber?
16:22	Dyrcona	berick: Yeah, offline is disabled.
16:22	* berick	hasn't followed too closely
16:22	miker	I'm currently looking into the "during" option, where parsing the response in the sax parser fails while the ejabberd server is sending more packets, and the client-side dies a terrible death
16:22	Dyrcona	berick: I see where it's going to pull 615 patrons data, but only 329 messages are later sent.
16:22	miker	berick: the search backend (client side) just disappears from the logs after making the request
16:23	Dyrcona	If I eliminate the patrons that do not have email, there are 449.
16:24	miker	of course, it could be that every bug looks like a nail right now ;)
16:24	Dyrcona	miker: Well, you can squish bugs with a hammer. ;)
16:25	miker	that's the hope, yes :)
16:26	berick	well that's it, then. every bug boils down to 1704396, but to fix it, every bug must be individually repaired.
16:26	mmorgan	Dyrcona's trigger is consistently failing, the metarecord search issue is inconsistent.
16:26	berick	no backsies
16:27	* mmorgan	would think that if it were the same issue, there might have been some successful action triggers since February.
16:28	miker	mmorgan: you know, you're absolutely right, and it is correct that I'm trying to squash that staple-bug with my nail-bug hammer
16:28	miker	everyone forget I was here today!
16:29	Dyrcona	yeah, I know it's intermittent.
16:30	miker	(and, for less silliness, the difference in the evidence between them is that Dyrcona's trigger drone logs its disconnection with ejabberd, my search backends simply disappear from the face of the earth)
16:30	miker	sorry for the noise, in any case
16:32	Dyrcona	miker: You tried loglevel 4?
16:32	miker	Dyrcona: you read my mind. I just bumped up to debug, and will wait for something to happen...
16:33	miker	and hope the drive space holds out until then...
16:34	miker	and further hope that the problem isn't a tight race condition that's masked by the extra logging time of DEBUG
16:34	Dyrcona	Yeah, really. Almost 14MB just from starting up services.
17:02	* miker	selectively restarts just the services to inspect
17:06		mmorgan left #evergreen
17:13		Dyrcona joined #evergreen
17:20		Jillianne joined #evergreen
17:51	Bmagic	I say we spend the Hack-a-way building this https://youtu.be/_fTC_Ud_k3U
17:57		jvwoolf joined #evergreen
18:01	pinesol_green	News from qatests: Test Success <http://testing.evergreen-ils.org/~live>
18:04		jvwoolf left #evergreen
18:18	gsams	I'm seeing an incredibly quickly repeated error in ap_error.log files that starts " CGI::param called in list context from package Template::Provider line 1, this can lead to vulnerabilities."
18:18	gsams	This is leading to some massive files.
18:19	gsams	And I'm afraid I'm not really sure where to begin.
18:20	gsams	"CGI::param called in list context from package Template::Provider line 1, this can lead to vulnerabilities. See the warning in "Fetching the value or values of a single named parameter" at /usr/share/perl5/CGI.pm line 436."
18:20	gsams	Appears to be the full line
18:24	jeffdavis	gsams: I'm not seeing that. What EG/Apache/Linux version are you using?
18:28	gsams	jeffdavis: 2.11.3/2.4.10/Debian Jessie
18:30		jihpringle joined #evergreen
18:31	gsams	looks like the past 4 days have suddenly ballooned our errors by quite a bit... daily directories are suddenly roughly 9G in size from 100-350M
18:33	jeffdavis	What version of the TT2 perl packages do you have? dpkg -s libtemplate-perl
18:33	gsams	2.24-1.2+b1
18:33	gsams	I believe
18:34	gsams	I believe I have found why the sizes are so different, the problem has apparently been there all along, we just ran out of space for the files to zip.
18:34	gsams	which I'll be correcting shortly.
18:34	gsams	still an underlying problem somewhere I'd like to correct of course.
18:37	jeffdavis	we're on libtemplate-perl 2.24-1.1 (EG 2.12 on Ubuntu 14.04)
19:34		sandbergja joined #evergreen
20:01		kakious joined #evergreen
20:02	kakious	Hello! Anyone here able to help me with an issue with creating a new staff member? I created one and granted System Administrative Privileges but they can not log in via the staff client.
20:07	jihpringle	kakious: it sounds like they're missing a permission - maybe STAFF_LOGIN
20:07	jihpringle	Admin -> Server Administration -> Permission Groups
20:07	kakious	They do have that user permission enabled
20:08	jihpringle	I'd recommend posting your questions to the general evergreen list https://evergreen-ils.org/communicate/mailing-lists/
20:09	jihpringle	or asking it again between 9:00am and 5:00pm Eastern Time (that's when this channel is most active)
20:09	_adb	can the user log into the opac?
20:10	jihpringle	I have to head out now, good luck
20:10	kakious	No, The can not log in via web or staff client
20:11	kakious	They*
20:11		kakious_ joined #evergreen
20:11	_adb	my first guess is that it's something mundane. try resetting the password to something easy to type.
20:12	kakious	No Result.
20:13	_adb	can anyone log in? is it just this one user?
20:13		kakious_ joined #evergreen
20:31	rhamby	kakious: do they have a working org unit assigned?
20:56		yboston_ joined #evergreen
20:58		_bott_ joined #evergreen
21:04		sandbergja joined #evergreen