IRC log for #evergreen, 2023-09-19

All times shown according to the server's local time.

Time	Nick	Message
03:35		kworstell-isl joined #evergreen
07:48		sandbergja joined #evergreen
08:05		BDorsey joined #evergreen
08:21		kworstell-isl joined #evergreen
08:27	sharpsie	@later tell Stompro our B&T attributes set for acq: https://pastebin.com/nzeZtYD1
08:27	pinesol	sharpsie: The operation succeeded.
08:28		kworstell-isl joined #evergreen
08:33		mmorgan joined #evergreen
09:10		Dyrcona joined #evergreen
09:15	Dyrcona	So, the action_trigger_runner blew up processing the modified autorenew events in my test last night. There were 55,706 of them.
09:17	Dyrcona	Most of the events last night (i.e. all of the events that ran) are collected or collecting. Only a few events completed. There was nothing else going on in the database, only this one vm running this test, so it's easy to overwhelm action trigger.
09:19	Dyrcona	Time for some spelunking in the logs.
09:21	Bmagic	why is authority_control_fields.pl so expensive?
09:21	Dyrcona	The open-ils.trigger stderr log has a new one on me: Caught error from 'run' method: Unable to update event state at /usr/local/share/perl/5.34.0/OpenILS/Application/Trigger/Event.pm line 247. That might have something to do with a not connected to the network message a few errors higher up. Too bad the stderr log lacks timestamps.
09:22	Dyrcona	Bmagic: Authorities are expensive.
09:22	Dyrcona	All of the links to other authorities and bibs can be expensive to maintain.
09:22	Bmagic	I'm dealing with two of them dogpiling because the first one didn't finish the day before
09:23	Dyrcona	Well, I don't think they're meant to be run regularly. Just once in a while and then the triggers take care of things.
09:24	Bmagic	This isn't the first time of course. It's a nightly thing, gathering up the bibs from the last 24 hours and running the linker
09:24	Dyrcona	Yeah, don't do that. Database triggers take care of new bibs added normally.
09:24	Bmagic	oh really, huh
09:25	Bmagic	so what's the current recommendation with regards to those two auth perl scripts? no need anymore?
09:25	Dyrcona	You only really need to run the linker script after a big authority load or when first setting up. Running it quarterly, monthly, or weekly, is probably OK too.
09:26	Dyrcona	We run it once per quarter along with our authority updates from Backstage Library Works.
09:26	Bmagic	and when you run it quarterly for example, you only run it over the bibs that were added in that quarter?
09:26	Dyrcona	Yes.
09:26	Bmagic	that's gotta take days to finish
09:26	Dyrcona	No.
09:26	Dyrcona	A few hours.
09:27	Bmagic	then something is different about this system
09:27	Dyrcona	Tried a vacuum analyze?
09:27	Bmagic	it spends like 4 seconds per bib ID
09:28	Bmagic	the db gets that over the whole db every sunday
09:28	Bmagic	maybe that should be nightly
09:28	Dyrcona	It can take 2 to 4 seconds to update a bib because of all the triggers.
09:29	Dyrcona	I mean update a bib in general.
09:30	Dyrcona	What we do is in our MOBIUS utilities repo, you have access to it, Bmagic, if you want to look. I've shared a generic version somewhere before, probably on pastebin, too.
09:30	Bmagic	right
09:31	Dyrcona	We basically rely on the EOLI migration tools staged_bib_overlay to run the authority linker scripts for us.
09:32	Dyrcona	I'm pretty sure it only runs on the updated bibs and authorities. Maybe only the new ones.
09:37	Dyrcona	We have had authorities that wouldn't update from the client in the past because it took more than 6 seconds for the update to complete, so the connection timed out and the transaction rolled back. Jackie Chan is one that I remember. I could do the update via the database just fine.
09:37	Dyrcona	That was because of the linking triggers, IIRC.
09:42	Bmagic	gotta watch out, those ninjas will getcha
09:43	Dyrcona	Heh.
09:43	Bmagic	even when they're database rows
09:45	* Dyrcona	starts a playlist. "Do it Again" from Steely Dan seems appropriate music for figuring out the proper order to put a TLS certificate bundle together.
09:46	Bmagic	haha, yep. perfect.
09:46	Dyrcona	Bmagic: You'll get a ticket later today. :)
09:46	Bmagic	I love tickets!
09:46	Dyrcona	I wish the vendor would put the bundle together, but whatever.
09:46	Bmagic	vendors--
09:46	Bmagic	tickets++
09:47	Dyrcona	It's not difficult, just annoying.... I suppose I could inspect the almost expired bundle to see what order those are in.
09:47	Bmagic	I just guess and check :)
09:48	Bmagic	cat cert1 cert2 >> final.crt .... nope, cat cert2 cert1 >> final.crt .... much better
09:51	Dyrcona	Yeah, but I've got 4 total certs for the chain file.
09:56	Dyrcona	I suppose I can figure it out by inspecting the certs. Our cert goes first, then I think they go in order from the one that signed our cert down to the root.
09:57	Dyrcona	And, I can omit the self-signed root authority assuming that the recipient has it anyway.
10:01	Dyrcona	Right. Easy-peasy... Now to test it.
10:06	jeff	for intermediate certs, apache (starting with 2.4.8) is leaf-to-root, nginx needs the leaf to be first but may tolerate a different order on the others (but why complicate things -- leaf-to-root works here too). pound uses leaf to root followed by the private key in the same PEM file. in most cases (with the possible exclusion of some weird cross-signing), the self-signed actual root cert found in the
10:06	jeff	browser's trust store doesn't need to be included in the file or sent.
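
The leaf-to-root ordering described above can be sketched in a few lines. This is only an illustration, assuming each certificate has already been reduced to its subject and issuer names (for example by inspecting the PEM files by hand); the `Cert` class, the `order_chain` function, and the sample names are hypothetical and not part of any tool mentioned in the channel.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cert:
    """Hypothetical stand-in for a parsed certificate: just two names."""
    subject: str
    issuer: str


def order_chain(certs, include_root=False):
    """Return certs ordered leaf first, then each signer in turn.

    The self-signed root is dropped unless include_root is True, on the
    assumption that the recipient already has it in its trust store.
    """
    by_subject = {c.subject: c for c in certs}
    # Names that sign some other certificate in the set.
    signers = {c.issuer for c in certs if c.issuer != c.subject}
    leaves = [c for c in certs if c.subject not in signers]
    if len(leaves) != 1:
        raise ValueError("expected exactly one leaf certificate")

    chain, seen = [], set()
    current = leaves[0]
    while current is not None and current.subject not in seen:
        seen.add(current.subject)
        is_root = current.issuer == current.subject
        if is_root and not include_root:
            break
        chain.append(current)
        if is_root:
            break
        # Step up to whichever certificate signed the current one.
        current = by_subject.get(current.issuer)
    return chain


# Four certs, deliberately shuffled, as in the "4 total certs" case.
bundle = [
    Cert("Root CA", "Root CA"),
    Cert("Intermediate B", "Root CA"),
    Cert("example.org", "Intermediate A"),
    Cert("Intermediate A", "Intermediate B"),
]
ordered = [c.subject for c in order_chain(bundle)]
print(ordered)
```

Concatenating the PEM files in the printed order would give the chain file; real certificates should be matched on their actual subject and issuer fields rather than on made-up labels like these.
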
10:07	jeff	exim and dovecot and friends i usually look in the docs, or follow whatever my deployment scripts or existing files contain. :-)
10:07	Dyrcona	exim and dovecot work with whatever certbot does. I can say that. Oh, so does Apache 2.4.
10:08	jeff	happily, with intermediates being more or less a fact of life now, you rarely have to go digging in the source or rely on empirical test-and-hope techniques... since they're so common, they're much more well-documented now. :-)
10:08	Dyrcona	This one is just nginx and apache, and I'm getting a gateway timeout.
10:08	jeff	I am happy to keep certbot far, far away from as many systems as possible.
10:08	Dyrcona	Well, the RFC also specifies the order, and most software follows the RFC.
10:09	Dyrcona	I use certbot at home and only update when I get the email. I have to do the challenges manually because my DNS provider doesn't have an API.
10:09	Dyrcona	Or, didn't last time I asked them about it. :)
10:10	Dyrcona	And an internal server error. Hmmm. Let me try a VM that I know was working with the old cert. This one might not be set up right.
10:12	jeff	happily, DNS-01 verification supports CNAME based delegation of challenge RRs, so if you have access to a provider that DOES have an API that's supported by your preferred ACME client, as long as your ACME client supports updating a different set of records as well, you can automate your renewals without needing to change the hosting of the entire zone in question.
10:12	Dyrcona	I've got a few days to fix this. The old cert doesn't expire until next week.
10:13	Dyrcona	Yeah. I do a wildcard for 3 domains, so it's like 6 updates. Not terribly onerous.
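
As a rough sketch of the CNAME-based delegation jeff describes: the DNS-01 challenge record for a name lives at `_acme-challenge.<name>`, and a wildcard identifier uses the same record name as its base domain. The helper names and the delegated zone `acme.example.net` below are assumptions made for illustration only; the target naming scheme is up to whoever runs the delegated zone.

```python
CHALLENGE_PREFIX = "_acme-challenge."


def challenge_name(identifier: str) -> str:
    """Return the DNS-01 challenge record name for an identifier.

    A wildcard such as *.example.org is validated at the same record
    name as example.org, so the leading "*." is stripped.
    """
    name = identifier.rstrip(".").lower()
    if name.startswith("*."):
        name = name[2:]
    return CHALLENGE_PREFIX + name


def delegation_records(identifiers, delegated_zone):
    """Map each challenge name to a CNAME target in a delegated zone.

    delegated_zone is a hypothetical zone hosted by a provider with an
    API; the "<base>.<zone>" target layout is an arbitrary choice.
    """
    records = {}
    for ident in identifiers:
        source = challenge_name(ident)
        base = source[len(CHALLENGE_PREFIX):]
        records[source] = f"{base}.{delegated_zone}"
    return records


cnames = delegation_records(
    ["example.org", "*.example.org"], "acme.example.net"
)
for source, target in cnames.items():
    print(f"{source}. CNAME {target}.")
```

The base name and its wildcard collapse to a single CNAME here, which is created once; the ACME client then writes the TXT values at the target name on each renewal.
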
10:14	Dyrcona	Hey. Kernel update for this other VM. Might as well install it and reboot.
10:15	Dyrcona	So, I'm going to try the old cert first, then switch it out. Maybe I botched something, but a bad cert. chain doesn't usually lead to an internal server error.
10:17	Dyrcona	Yeah, new cert. works. I guess something else is wrong on the other virtual machine. I probably missed something in the Apache configuration.
10:18	jeff	if the 5xx error is being generated by a proxy and you're using https for the proxy-to-backend communication, your backend service may have unexpectedly transitioned to a certificate type that your proxy or its ssl libs are not capable of handling. we've seen that before, most recently with EC256 certs.
10:19	jeff	(I'm not suggesting that's the issue here, just the first instance of a certificate-related internal server error / 5xx status)
10:19	Dyrcona	I still suspect I missed something, but I'll check the cert types again.
10:19	jeff	(...that came to mind) [remember to finish sentences when possible, jeff:-P]
10:21	Dyrcona	:)
10:22	Dyrcona	I understood, and nope, not EC certs.
10:23	Dyrcona	I'll check the logs for errors later. It's probably something obvious.
10:36	Dyrcona	cert works on our training server...
10:39	Dyrcona	Speaking of "using https for the proxy-to-backend communication," we probably shouldn't given that our instructions typically run nginx and apache on the same host. We should use HTTP between the two to improve the performance.
10:39	Dyrcona	I have been meaning to experiment with that.
10:41	Dyrcona	OK. On the trigger issue: I ran out of trigger drones.....
10:44	Dyrcona	Heh. 'no children available and backlog queue at limit'
10:44	Dyrcona	I'm still trying to figure out what I did with Apache.
10:45	Bmagic	Dyrcona: that would be nice (non-ssl comms for nginx-apache), and I've always assumed that the only thing we'd need to do is remove the perl line "return $self->redirect_ssl unless $self->cgi->https;"
10:47	Dyrcona	Bmagic: There might be a couple of other things, but yeah, there's at least that 1 line in the code, and some configuration.
10:48	Dyrcona	I'd probably coincide this with having nginx redirect HTTP to HTTPS or even not list on port 80 at all.
10:48	Dyrcona	s/(list)/\1en/
10:55	Dyrcona	Y'know. I'm starting to suspect that the 500 Internal Server Error was caused by the backlog. It wasn't just open-ils.trigger that was overloaded.
10:56	Dyrcona	Yeah. It's fine now that I've restarted services.
10:57	Dyrcona	So, my test failed successfully. I now know that I need to increase resources before trying this again. :)
10:58		briank joined #evergreen
11:35	sharpsie	@dunno add SYSTEM CONTROL RESTART OPEN SURF
11:35	pinesol	sharpsie: Error: You must be registered to use this command. If you are already registered, you must either identify (using the identify command) or add a hostmask matching your current hostmask (using the "hostmask add" command).
11:36	sharpsie	@dunno add SYSTEM CONTROL RESTART OPEN SURF
11:36	pinesol	sharpsie: The operation succeeded. Dunno #80 added.
11:37	abneiman	Bmagic: can you please see if the docs built last night? I pushed some big changes on behalf of DIG & I'm not seeing them after several hard refreshes.
11:47	Bmagic	abneiman doesn't look like it. the file timestamps are the day before at 2am. I'll see what the error is
11:47	abneiman	Bmagic++
11:47	abneiman	thank you!
11:49	Bmagic	error: Page alias cannot reference an existing page: latestdocs:reports:reporter_running_recurring_reports.adoc (specified as: reporter_running_recurring_reports.adoc)
11:49		bgillap joined #evergreen
11:52	abneiman	gah
11:52	abneiman	that's on me
11:52	abneiman	I will push a follow up
11:52	abneiman	Thanks!
11:53	abneiman	Bmagic++ # again
11:53	Bmagic	no worries!
11:54	abneiman	honestly, given the amount of changes I put in, I'm a little relieved that there was only the one error!
11:54	Bmagic	great job!
11:55	mmorgan	abneiman++
11:55	Bmagic	abneiman++
12:01	Dyrcona	abneiman++
12:04	abneiman	DIG++ # did most of the work!
12:07		jihpringle joined #evergreen
12:53		mantis1 joined #evergreen
13:35	abneiman	sandbergja++ # adoc assistance
13:35	abneiman	Bmagic: I pushed a followup, fingers crossed for a clean build tonight!
13:49	Dyrcona	JBoyer: Following up on a private conversation from last week: Do you know if EOLI has any marc_export patches dealing with record size? If not, I'm curious how you handle the "export oversize MARC records in XML" for Aspen.
13:58	pinesol	News from commits: Docs: followup commits to Reports docs <https://git.evergreen-ils.org/?p=Evergreen.git;a=commitdiff;h=e7b4f2d7d4479f1f409a4baed68a5edc9541f9ae>
14:51	* Dyrcona	applies all the patches to marc_export.
15:30		mantis1 left #evergreen
15:32		jihpringle joined #evergreen
16:29		Stompro joined #evergreen
16:32	Stompro	sharpsie, thanks for the B&T attributes... just logged back in. pinesol++
17:07		mmorgan left #evergreen
18:27		jihpringle joined #evergreen
18:54		sandbergja left #evergreen
19:42		jihpringle joined #evergreen