IRC log for #evergreen, 2022-01-27

All times shown according to the server's local time.

Time	Nick	Message
04:15		JBoyer_ joined #evergreen
04:51		rjackson_isl_hom joined #evergreen
05:30		alynn26 joined #evergreen
05:39		gmcharlt joined #evergreen
06:02	pinesol	News from qatests: Testing Success <http://testing.evergreen-ils.org/~live>
08:11		mantis1 joined #evergreen
08:30		Dyrcona joined #evergreen
08:38		mmorgan joined #evergreen
09:00		rfrasur joined #evergreen
09:26		jvwoolf joined #evergreen
09:26	Dyrcona	Went and made a custom type just returning ROW would probably work.
09:31	Dyrcona	Hmm. Missing a conjunction there.... Or, RECORD maybe...
10:36		awitter joined #evergreen
10:48	csharp_	I'm still scouring opensrf/ejabberd logs and I'm not finding anything obvious
10:48	csharp_	it's like the client dies off and no one cares, log-wise
10:48	csharp_	not sure where to start adding debug to perl/C whatever
10:49	csharp_	you would think if ejabberd was under some sort of duress it would be showing error log messages
10:49	csharp_	I think "timeout" is a symptom, not a cause
10:50	csharp_	(as in 2022-01-20 11:12:01.958 [info] <0.28199.1>@ejabberd_c2s:process_terminated:271 (tcp|<0.28199.1>) Closing c2s session for opensrfprivate.brick01-head.gapines.org/open-ils.actor_listener_brick01-head.gapines.org_118630: Connection failed: timeout )
10:50	Dyrcona	Time to add more bricks?
10:50	csharp_	and looking at the code, "just deactivate new cataloging UIs" is not as easy as I thought it might be
10:50	Dyrcona	Never is.
10:51	csharp_	my lack of Angular/AngJS foo is working against me there too
10:51	Dyrcona	Have some lasagna, by the way we only have spaghetti noodles.
10:51	csharp_	:-/
10:51	* csharp_	irons the noodles in hopes they get flat enough
10:52	Dyrcona	I'm getting reports that searching OCLC via Z39.50 is not working. I look in the logs and see "search returned 0 hits" and no errors. Anyone else getting similar reports from libraries?
10:55	csharp_	this doesn't feel like a networking threshold issue either - it's random
10:55	csharp_	and it only happens during the workday
10:56	Dyrcona	It still feels like a resource limit to me, maybe not one you can adjust.
10:56	csharp_	I liked the old days when I didn't have to give a sh*t about ejabberd :-(
10:56	Dyrcona	I see lines like this, and I wonder if that is what was really meant: my $count = $$res{count} = $results->size;
10:58	csharp_	if it's a resource limit, would I see something at 7:30 p.m.? 2022-01-26 19:33:06.051 [info] <0.6239.0>@ejabberd_c2s:process_terminated:271 (tcp|<0.6239.0>) Closing c2s session for opensrfprivate.brick05-head.gapines.org/open-ils.actor_listener_brick05-head.gapines.org_92370: Connection failed: timeout
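One way to test the "only during the workday" impression against the 19:33 counterexample is to bucket the timeout lines by hour. The following is a minimal sketch, assuming every relevant line has the same shape as the two ejabberd lines quoted in this log; the function name `timeouts_by_hour` and the regular expression are illustrative, not part of ejabberd or OpenSRF.

```python
import re
from collections import Counter

# Matches lines shaped like the two ejabberd samples quoted in the log.
# Assumption: timestamp first, then "[info]", then the c2s close message.
TIMEOUT_RE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<hour>\d{2}):\d{2}:\d{2}\.\d+ \[info\] "
    r".*Closing c2s session for (?P<account>\S+?)/(?P<resource>\S+): "
    r"Connection failed: timeout"
)

def timeouts_by_hour(lines):
    """Count 'Connection failed: timeout' session closes per (date, hour)."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.match(line)
        if match:
            counts[(match["date"], int(match["hour"]))] += 1
    return counts

# The two sample lines from the conversation.
SAMPLE_LINES = [
    "2022-01-20 11:12:01.958 [info] <0.28199.1>@ejabberd_c2s:process_terminated:271 "
    "(tcp|<0.28199.1>) Closing c2s session for "
    "opensrfprivate.brick01-head.gapines.org/"
    "open-ils.actor_listener_brick01-head.gapines.org_118630: "
    "Connection failed: timeout",
    "2022-01-26 19:33:06.051 [info] <0.6239.0>@ejabberd_c2s:process_terminated:271 "
    "(tcp|<0.6239.0>) Closing c2s session for "
    "opensrfprivate.brick05-head.gapines.org/"
    "open-ils.actor_listener_brick05-head.gapines.org_92370: "
    "Connection failed: timeout",
]

if __name__ == "__main__":
    for (date, hour), count in sorted(timeouts_by_hour(SAMPLE_LINES).items()):
        print(f"{date} {hour:02d}:00  {count}")
```

Feeding it a whole log file (for example `timeouts_by_hour(open(path))`, where `path` is wherever the ejabberd log lives on a given brick) would show whether the events cluster in working hours or are spread across the day.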
10:59	Dyrcona	Yeah, maybe. It's hard to say.
10:59	Dyrcona	I know of lot of things don't get logged by any part of the chain that I wish were logged.
11:00	Dyrcona	s/of/a/
11:00	csharp_	I've thought that for a long time and it's really kicking my ass right now
11:01	csharp_	too many of the less useful messages, not enough of the useful ones (of course, that's pretty subjective and contextual)
11:01	Dyrcona	Yeah.
11:02	Dyrcona	I'm just grepping logs for things remotely related to the OCLC situation, and I see lots of bad input in the error logs. :)
11:02	Dyrcona	Not for the OCLC/Z39.50, though, but for other searches and some circ calls.
11:03	Dyrcona	This isn't going to match an ISBN: *&0062993151
11:04	Dyrcona	Some of it looks like someone trying to fuzz one of the gateways.
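The `*&0062993151` value quoted above is a useful test case: a strict ISBN-10 check rejects it because of the leading punctuation, while the digits on their own pass the standard ISBN-10 checksum. The sketch below shows that check in Python; it is an illustration of the checksum only, not the validation Evergreen itself performs, and `is_isbn10` and `strip_non_isbn_chars` are hypothetical helper names.

```python
import re

ISBN10_RE = re.compile(r"\d{9}[\dX]")

def is_isbn10(text):
    """Return True if text is exactly an ISBN-10 with a valid check digit."""
    if not ISBN10_RE.fullmatch(text):
        return False
    total = 0
    for position, char in enumerate(text):
        value = 10 if char == "X" else int(char)
        # Weights run from 10 (first character) down to 1 (check digit).
        total += (10 - position) * value
    return total % 11 == 0

def strip_non_isbn_chars(text):
    """Drop everything except digits and X, which is only valid as a check digit."""
    return re.sub(r"[^0-9Xx]", "", text).upper()

if __name__ == "__main__":
    raw = "*&0062993151"
    print(is_isbn10(raw))
    print(is_isbn10(strip_non_isbn_chars(raw)))
```

Whether to clean such input before searching or to reject it outright is a design choice; the sketch only shows that the two outcomes differ for this value.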
11:09	csharp_	@blame THE FUZZ
11:09	pinesol	csharp_: THE FUZZ WILL PERISH UNDER MAXIMUM DELETION! DELETE. DELETE. DELETE!
11:10	Dyrcona	In my case, I don't see any signs of a problem, just that OCLC searches are returning 0 results. Again, this is basically useless, though not as serious.
11:14	Dyrcona	csharp_: Maybe it is time to ditch ejabberd for something else?
11:16	berick	i'll be proposing a conf. session on just that topic
11:16	alynn26	+1 for ditching ejabberd
11:17	berick	(which obv. won't help csharp_ in the short term)
11:17	Dyrcona	berick: You have something specific in mind?
11:17	berick	Dyrcona: i have a proof of concept using Redis
11:20	Dyrcona	berick: Adding a new Transport, or is it more complicated than that?
11:22	csharp_	berick: interesting
11:22	berick	Dyrcona: adding a new transport, while also scrapping the xmpp xml wrapper, which requires a few additional changes, but not huge changes.
11:23	Dyrcona	Yes, definitely sounds interesting.
11:24	berick	well, other changes too. i'm still experimenting and writing up the proposal
11:25	Dyrcona	berick++
11:26	csharp_	berick++
11:26	csharp_	well, the PINES starr are talking about halting cataloging for a day to rule those out as possible causes - feels extreme to me, but without much to go on from the logging end, I'm getting desperate
11:27	csharp_	er.. s/starr/staff/
11:28	berick	csharp_: like Dyrcona asked, have you considered adding a server or two to help spread the load until this can be resolved? seems like one of those things that can't hurt...
11:31	csharp_	berick: no, I haven't really considered that - given the fact that the servers don't appear to be under any sort of duress it might be adding to my babysitting duties
11:35	berick	csharp_: yeah, i get that. just thinking if the problem only happens during the workday, more load seems at least partially to blame
11:35	berick	load that could be spread
11:37	csharp_	interesting - I'm thinking about that
11:52	Dyrcona	Is Mercury in retrograde?
11:54	rjackson_isl_hom	quick question to see if this rings a bell and any helpful hints on how to resolve: db server replaced overnight due to hardware issues - up and running but seeing errors when translate_isbn1013 is called and error indicates can't locate Business/ISBN.pm - which is part of the db function
11:54	rjackson_isl_hom	is this a path from within the postgres install on the db server that needs adjusted, or ???
11:55	Dyrcona	rjackson_isl_hom: You probably need to install the db server prerequisites.
11:55	rjackson_isl_hom	actual error looks like this
11:55	rjackson_isl_hom	DBD::Pg::st execute failed: ERROR: Can't locate Business/ISBN.pm in @INC (you may need to install the Business::ISBN module) (@INC contains: /etc/perl /usr/local/lib/x86_64-linux-gnu/perl/5.26.1 /usr/local/share/perl/5.26.1 /usr/lib/x86_64-linux-gnu/perl5/5.26 /usr/share/perl5 /usr/lib/x86_64-linux-gnu/perl/5.26 /usr/share/perl/5.26 /usr/local/lib/site_perl) at line 2.\nBEGIN failed--compilation aborted at line
11:55	rjackson_isl_hom	2.\nCONTEXT: compilation of PL/Perl function "translate_isbn1013" [for Statement " -- bib search:
11:57	rjackson_isl_hom	Dyrcona I am back seat driver but trying to assist if I can! What are the prerequisites that might be missing and can this be fixed post db load? System is "up" relatively speaking
11:59	Dyrcona	rjackson_isl_hom: Look in the Open-ILS/extras/install/Makefile.<distro>-<release> for your Evergreen version, distro, and release. You need to install the packages listed for DEB_PGSQL_COMMON_MODS and the cpan modules from CPAN_MODULES_PGSQL.
12:00	rjackson_isl_hom	OK thanks I will pass that on Dyrcona++
12:03	Dyrcona	rjackson_isl_hom: I omitted a level in the path before. It is supposed to be Open-ILS/src/extras/install/Makefiel.<distro>-<release>
12:03	Dyrcona	typos--
12:03	Dyrcona	Makefile.....
12:03	Dyrcona	Anyway, back to the mystery of why Vandelay is suddenly slow today.....
12:04		jihpringle joined #evergreen
12:25	Dyrcona	@dunno
12:25	pinesol	Dyrcona: No, you're a puzzleheaded kraken!
12:32		nfBurton joined #evergreen
12:36	jeffdavis	No need to get personal, pinesol.
12:48	jeffdavis	It would be interesting to see if that same PINES problem exists with a different version of ejabberd (e.g. by putting an Ubuntu 20.04 server in rotation temporarily - might be a lot of work for low risk of reward though)
12:50		abowling joined #evergreen
12:57	csharp_	jeffdavis: that occurred to me too
13:02	rjackson_isl_hom	still pounding head to wall - we have app servers that did not change and the ISBN.pm module is installed there
13:03	Dyrcona	rjackson_isl_hom: You need to install the things that I mentioned on the database server. It needs those Perl modules.
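A quick way to confirm the point about the database server is to ask the Perl on that host whether it can load the module named in the error. This is a minimal sketch, assuming the `perl` on the PATH is the same interpreter PL/Perl uses (which may not hold on every install); the helper names are hypothetical.

```python
import subprocess

def perl_module_check_command(module):
    """Build a command that exits 0 only if Perl can load the module."""
    return ["perl", f"-M{module}", "-e", "1"]

def perl_module_available(module):
    """Return True if the local perl can load the module, False otherwise."""
    try:
        result = subprocess.run(
            perl_module_check_command(module),
            capture_output=True,
            text=True,
        )
    except FileNotFoundError:
        # No perl executable on the PATH at all.
        return False
    return result.returncode == 0

if __name__ == "__main__":
    # Run this on the database server, not on an app server.
    print(perl_module_available("Business::ISBN"))
```

Running it on an app server and on the database server should give different answers if the module is only missing on the database host.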
13:04	rjackson_isl_hom	OK - looking further
13:08	jeff	csharp_: have you captured network traffic (specifically / especially XMPP traffic) during one of the events?
13:11		jihpringle joined #evergreen
13:15	Dyrcona	I'm also being told that bib records won't overlay. Anything useful in the logs? Doesn't look like it so far.
13:16	Dyrcona	There is way too much "noise" in the logs.
13:18		awitter joined #evergreen
13:24	Dyrcona	Ah. Read the new ticket more carefully, and it's the same as the other OCLC not working ticket. They just worded it as overlays not working because the user intended to overlay with records from OCLC.
14:12		jvwoolf joined #evergreen
14:13		jvwoolf1 joined #evergreen
14:15	Dyrcona	phasefx: If we're using the new Stripe API, should Stripe.pm come into play? I'm getting reports of internal server errors when people try to pay with a credit card, and I'm seeing "Can't connect to api" errors from Stripe.pm.
14:15	Dyrcona	It has been one of those days.
14:16	Dyrcona	Since all of the problems seem to be related to us talking to outside vendors, I'm starting to think something is wrong with the network at the colocation facility.
14:22	rfrasur	one_of_those_days--
14:50	csharp_	jeff: not yet
14:50	csharp_	I was experimenting with tshark earlier (TCP on port 5222 listening on "any" or "lo")
14:53	Dyrcona	Well, don't reboot your load balancer, ever.... :)
14:54	Dyrcona	My networking issues were caused by iptables rules not loading when the load balancer was rebooted last night. I think an update obliterated the script that was loading the rules.
14:55		rfrasur joined #evergreen
14:55	csharp_	@praise The Rules
14:55	* pinesol	The Rules is very kind and good-looking and always does what's best for the project
14:58	csharp_	https://www.reddit.com/r/ProgrammerHumor/comments/sdhsaf/programming/ - saw this last night and it hit home
14:58	csharp_	still in the I HATE PROGRAMMING stage of grief
15:01	Dyrcona	That gets me right in the feels. :)
15:30	csharp_	sigh - with tshark running I'm not seeing the issue
15:31	csharp_	could be coincidence, but I haven't seen an event in 30 mins
15:38	csharp_	ha! got one
15:38	csharp_	...and immediately another
15:42	Dyrcona	Any clue what the problem is?
16:20		jvwoolf1 left #evergreen
17:11		mmorgan left #evergreen
18:00	pinesol	News from qatests: Testing Success <http://testing.evergreen-ils.org/~live>