IRC log for #evergreen, 2023-10-12

All times shown according to the server's local time.

Time	Nick	Message
06:58		collum joined #evergreen
07:11		kworstell-isl joined #evergreen
07:41		BDorsey joined #evergreen
07:56		redavis joined #evergreen
08:01		sandbergja joined #evergreen
08:16		Rogan joined #evergreen
08:38		mmorgan joined #evergreen
08:45		dguarrac joined #evergreen
10:20		Dyrcona joined #evergreen
10:25	Dyrcona	jeff: It definitely looks like the slow down in marc_export is caused by the sheer amount of data for the Perl to process. It's taking about the same amount of time without the options that I thought might be slowing things down. I'm going to do smaller batches to see if that helps.
10:29	berick	Collaborative code review sessions.. 4:45am Sundays?
10:32	mmorgan	berick: 11am Pacific/2pm Eastern on Mondays :) https://wiki.evergreen-ils.org/doku.php?id=faqs:evergreen_roadmap
10:32	Dyrcona	berick: Nah. On the 22nd, I'll be on my way to the airport for a flight to Indianapolis. :)
10:32	berick	was just looking at https://evergreen-ils.org/communicate/calendar/
10:33	Dyrcona	berick: I gotcha. Be nice to know what timezone that is.
10:34	* mmorgan	didn't even see that!
10:36	Dyrcona	Tahiti! I'm in!
10:36	berick	woohoo
10:37	Dyrcona	UTC-10 is the closest I came up with, maybe it should be UTC-9 or UTC-11. Not sure.
10:39	Stompro	Can anyone give me a hint where to find the Legacy undated circs in the database.
10:40	mmorgan	Stompro: extend_reporter.legacy_circ_count
10:40	Stompro	mmorgan++ thank you!!
10:41	mmorgan	YW!
10:43	Stompro	Dyrcona, I was conversing with Brian about his marc_export issues. And I did a test export of our DB. 200K bibs took 5 minutes, used 1.2GB of RAM, and created a 1GB uncompressed xml file, which compressed to 115MB.
10:43	Dyrcona	user_ingest_name_keywords_tgr BEFORE INSERT OR UPDATE ON actor.usr FOR EACH ROW EXECUTE PROCEDURE actor.user_ingest_name_keywords() <- I think that should maybe be an AFTER, but I'll play with it.
10:45	Stompro	That was without items... I should try it again with items.
10:45	Dyrcona	Stompro: I still have to do this in production, but it has always taken longer than that. Are you feeding IDs to marc_export, and what marc_export options are you using?
10:48	Dyrcona	Yeah, adding --items is what seems to do it.
10:48	Stompro	Feeding ids, only arguments there were EXTRACT_ARGS="--encoding=UTF-8 -f XML"
10:48		briank joined #evergreen
10:49	Dyrcona	Also, I realize my comment about the user_ingest_name_keywords_tgr needing to be AFTER is bogus. I pulled an extra couple of fields in my query and discovered why I was seeing what I thought was anomalous behavior.
10:50		kworstell_isl joined #evergreen
10:51	Dyrcona	Some test accounts have weird names. :)
10:52	Dyrcona	The problem could be the extra query to grab item data combined with the massive amount of data.
10:54	Stompro	Dyrcona, --items didn't change the memory usage, still 1.2G for 194432 bibs.. run time seems longer.. I'll report back when done.
10:56	Dyrcona	I think running that select for items in a loop is the real issue. I should refactor this to grab the items at the time the query runs. That complicates the main loop though.
11:00	Dyrcona	I wonder if pulling out an ARRAY of asset.copy%ROWTYPE is possible, and if so, will that greatly expand the memory use? I think "Yes" on the second point.
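A minimal sketch of the refactoring idea discussed above: fetch the items for a whole batch of records in one query and group them in memory, instead of running one items query per record. This is not marc_export's actual code (which is Perl); the two tables below are simplified, hypothetical stand-ins for asset.call_number and asset.copy, and an in-memory SQLite database is used only to keep the example self-contained.

```python
import sqlite3
from collections import defaultdict


def fetch_items_batched(conn, record_ids):
    """Return {record_id: [barcode, ...]} using a single query for the batch."""
    items = defaultdict(list)
    if not record_ids:
        return items
    # One placeholder per record ID, so the values are bound, not interpolated.
    placeholders = ",".join("?" for _ in record_ids)
    sql = (
        "SELECT cn.record, cp.barcode "
        "FROM copy cp JOIN call_number cn ON cn.id = cp.call_number "
        f"WHERE cn.record IN ({placeholders}) "
        "ORDER BY cn.record, cp.id"
    )
    for record, barcode in conn.execute(sql, list(record_ids)):
        items[record].append(barcode)
    return items


# Toy data (hypothetical): record 100 has two copies, record 200 has one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE call_number (id INTEGER PRIMARY KEY, record INTEGER);
CREATE TABLE copy (id INTEGER PRIMARY KEY, call_number INTEGER, barcode TEXT);
INSERT INTO call_number VALUES (1, 100), (2, 100), (3, 200);
INSERT INTO copy VALUES (1, 1, 'A1'), (2, 2, 'A2'), (3, 3, 'B1');
""")

items_by_record = fetch_items_batched(conn, [100, 200, 300])
for record_id in (100, 200):
    print(record_id, items_by_record[record_id])
```

The trade-off is the one raised in the message above: holding every item row for a batch in memory at once costs more RAM than a per-record loop, so the batch size would need to be capped.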
11:01	Stompro	Dyrcona, Runtime was 10m37s for the --items run of 194432 bibs, 1.2G xml file uncompressed, 124M compressed. So --items doubled the run time.
11:02	Dyrcona	Yeah. I expected something like that. Stompro++
11:03	Dyrcona	I'm seeing something like 1,000/minute performance, but I'll wager we have more items.
11:07	Stompro	Without --items, marc_export pegs a cpu core, with --items it was only at 9% usage. So mostly waiting for query results.
11:08	Dyrcona	Over time I've seen it hit 98% CPU usage with --items.
11:09	Dyrcona	ps -o etime,pcpu,rss,pmem 110013
11:09	Dyrcona	ELAPSED %CPU RSS %MEM
11:09	Dyrcona	23:29:58 97.2 9550376 29.0
11:09	Dyrcona	360853 <- Number of records in the binary MARC file.
11:10	Dyrcona	/openils/bin/marc_export --all --items <-- command line
11:12	Dyrcona	I wonder what our average copy count per bib is? Probably not that high, though we do have 1 bib with over 4,000.
11:13	Dyrcona	I estimate it will export about 1.78 million records.
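A rough worked calculation from the figures pasted above (elapsed time 23:29:58, 360853 records written, about 1.78 million expected). It assumes the export rate stays constant for the rest of the run, which may not hold if the large bibs are unevenly distributed; the function names are illustrative only.

```python
def etime_to_seconds(etime):
    """Convert a ps 'etime' value ([[dd-]hh:]mm:ss) to seconds."""
    days = 0
    if "-" in etime:
        day_part, etime = etime.split("-", 1)
        days = int(day_part)
    parts = [int(p) for p in etime.split(":")]
    # Pad missing leading fields (hours) with zero.
    while len(parts) < 3:
        parts.insert(0, 0)
    hours, minutes, seconds = parts
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


elapsed_s = etime_to_seconds("23:29:58")   # from the ps output
exported = 360853                          # records in the binary MARC file so far
expected_total = 1_780_000                 # the estimate given in the log

rate_per_min = exported / (elapsed_s / 60)
remaining_hours = (expected_total - exported) / (exported / elapsed_s) / 3600

print(f"{rate_per_min:.0f} records/minute")
print(f"about {remaining_hours:.0f} more hours at that rate")
```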
11:14	Stompro	Dyrcona, Never mind about the cpu usage, that was me seeing the 9% memory used by mistake.
11:15	Stompro	Using your method I see 67%cpu
11:16	Dyrcona	I'm going to try smaller batches in production starting this afternoon to see if it helps. I may or may not stop the one running on a test vm. Maybe it is time to refactor export to speed things up?
11:24	pinesol	News from commits: Docs: LP1845957 Permissions List with Descriptions <https://git.evergreen-ils.org/?p=Evergreen.git;a=commitdiff;h=2680eca9e4dbaa79b2cd00c7fd3373311b85901c>
11:24	pinesol	News from commits: Docs: LP1845957 Part 1 - Update describing_your_people.adoc <https://git.evergreen-ils.org/?p=Evergreen.git;a=commitdiff;h=c7b205fb604e7827843c7a5ea6542ce02c2f72ef>
11:25	Dyrcona	I think I'll break for lunch early and start on that about noon.
11:33		smayo joined #evergreen
11:34		smayo joined #evergreen
11:43	Dyrcona	You know what else is slow? Copying a lot of data to a USB stick. This one must be USB 1.1....
11:44	Dyrcona	Maybe it is just no good. It's about 7 years old.
11:54	pinesol	News from commits: Docs: LP2038448 correction to 3.8 release notes <https://git.evergreen-ils.org/?p=Evergreen.git;a=commitdiff;h=096e64d5d2a681972da48c7ef22ffedec2f2716c>
11:54	pinesol	News from commits: Docs: Rearranged the docs reports image files <https://git.evergreen-ils.org/?p=Evergreen.git;a=commitdiff;h=3c2a9993b4ec4822b40cfad0bbba0b0b81daccef>
12:03		jihpringle joined #evergreen
12:05	smayo	eeevil Do you remember why you added dow_count in evergreen.find_next_open_time()? I've been looking at https://bugs.launchpad.net/evergreen/+bug/1818912 and I think it might be the culprit.
12:05	pinesol	Launchpad bug 1818912 in Evergreen "Single Day Emergency Closings Fail to Update Due Dates Correctly" [High,Confirmed] - Assigned to Steven Mayo (stmayo)
12:19		jeffdavis_ joined #evergreen
12:22		pinesol` joined #evergreen
12:23	smayo	Perhaps 'do you remember this thing you wrote 5 years ago' is a bit of a long shot
12:24		kmlussier joined #evergreen
12:24	Dyrcona	smayo: Sometimes, "Do I remember what I did yesterday?" is a long shot. :)
12:25	kmlussier	Hello #evergreen
12:25	Dyrcona	Hello, kmlussier!
12:25	mmorgan	Hello!
12:25	kmlussier	@dessert [someone]
12:25	* pinesol	grabs some Key Lime Cheesecake for kmlussier
12:26	kmlussier	Thank you pinesol!
12:29	Dyrcona	@coffee kmlussier
12:29	* pinesol	brews and pours a cup of El Salvador La Montana Pacamara, and sends it sliding down the bar to kmlussier
12:30	Dyrcona	To go with the Key Lime Cheesecake.
12:30	kmlussier	Dyrcona: Thanks! I can always use a shot of caffeine, especially with my new commuting schedule.
12:35	* jonadab	prefers to get his caffeine from dark chocolate, or flavored black tea.
12:35	jonadab	Or both. Both is good.
12:36	Dyrcona	:)
12:37	kmlussier	jonadab: I'll accept caffeine in any form that it's given to me. :)
12:49	eeevil	smayo: I do not recall ... it's been A While (TM) ;) ... I can look, though
12:51	Dyrcona	hmm... marc_export should exit when a bad command line option is passed in.
12:56	Dyrcona	It's taking a while to export the "large" bibs. It has to be the items query that is slowing things down. These are 90 bibs in our database with > 500 items.
12:57	eeevil	@later tell smayo I do not recall ... it's been A While (TM) ;) ... I can look, though. UPDATE: looks like it was a "just check a week" flag, basically, though the breakout variable is similar (if 15x larger). if skipping dow_count testing makes everything happy, I'm for it.
12:57	pinesol	eeevil: The operation succeeded.
12:58		smayo joined #evergreen
12:58	Dyrcona	:)
12:59	smayo	Huzzah!
13:01	smayo	Should have a patch up once I brush up on the etiquette for changes to the sql.
13:07	kmlussier	smayo++ eeevil++
13:10	Dyrcona	18 minutes and 54 seconds and only 13 records exported.... Well, I know where I need to look.
13:34		smayo joined #evergreen
13:48		smayo joined #evergreen
13:50	Dyrcona	58 minutes and it is a bit over halfway done with the 90 large records. I have a 'debug' version of marc_export that I'll use to dump the queries on a test system.
13:56		jihpringle joined #evergreen
14:24	pinesol	News from commits: Docs: updates to Z39.50 documentation <https://git.evergreen-ils.org/?p=Evergreen.git;a=commitdiff;h=bb4d795eb16102c35d26f0d59d14da50b86605e4>
14:54	Dyrcona	Doing batches does not appear to have improved performance. If anything, it seems worse, but maybe my dev database is faster than production.
15:03	Dyrcona	Looks like I'm getting about 500/minute, but I'll check back after a few more batches have run.
15:03	berick	Dyrcona: are you running the client in Linux?
15:09	Dyrcona	berick: What client? I'm running marc_export on the command line.
15:09	Dyrcona	So, yes, it's Linux.
15:09	berick	yeah, the client
15:10	berick	was curious if you'd be a guinea pig. i have a limited-functionality rust marc exporter
15:10	berick	wonder how it compares
15:10	Dyrcona	Well, I was thinking that I would have to implement something just for this, so I'm willing to try your rust exporter.
15:11	berick	binary dm'ed
15:11	berick	./eg-marc-export --items --to-xml --out-file /tmp/recs.xml # --help also works / other options to limit the data set
15:12	Dyrcona	berick++ I'll give it a look.
15:12	berick	Dyrcona++
15:21	jeffdavis	Interesting performance problem on our test servers - the Items Out tab is very slow to load. Looks like the call to open-ils.pcrud.search.circ is consistently taking 6.5 seconds per item for some reason.
15:21	berick	oof
15:22	jeff	do the items in question have extremely high circulation counts?
15:23	jeff	we have a test item that gets checked in and out multiple times a day via SIP2 for a Nagios/Zabbix style check.
15:23	jeff	I once made the mistake of using that same item to test something unrelated, and it took consistently 6 seconds or more to retrieve in item status.
15:24	jeffdavis	No, these are just randomly selected items - the one I checked has 8 total circs.
15:24	jeff	(It may not apply in this case. I don't think the pcrud search is going to be trying to get a total circ count for the items.)
15:24	jeff	ah, drat.
15:25	jeffdavis	Our production environment is not affected, but a test server running the same version of Evergreen (3.9) is, as is a different test server running 3.11.
15:25		mmorgan1 joined #evergreen
15:28	Stompro	jeffdavis, are they all on the same Postgres version?
15:29	jeffdavis	Yes, all PG14. The test servers all share the same Postgres server, my guess is that's where the issue lies but not sure what the mechanism would be.
15:30	Dyrcona	jeffdavis: You have all of the latest patches for Pg installed?
15:30	Dyrcona	Meaning Evergreen patches.
15:32	jeffdavis	The affected servers are either 3.9.1-ish or 3.11.1-ish with some additional backports -- not fully up to date but pretty close. Are there specific recent patches you're thinking of?
15:36	Dyrcona	Yeah, but for some reason I can't find it. I thought it made it into main already.
15:36	Dyrcona	it may not affect this either.
15:38	Dyrcona	I thought there was something sharpsie found with Pg 12+ a few months ago, but I'm coming up blank.
15:39	Dyrcona	Maybe I'm thinking of Lp 1999274, which affects search? (IDK. I've got too much going on.)
15:39	pinesol	Launchpad bug 1999274 in Evergreen 3.10 "Performance of Search on PostgreSQL Versions 12+" [Medium,Fix released] https://launchpad.net/bugs/1999274
15:42	jeffdavis	we've got that one fortunately
15:44		Rogan joined #evergreen
15:47	Dyrcona	berick: evergreen-universe-rs: Does that require the Redis branch, or is that "just" recommended?
15:48	berick	Dyrcona: depends on what you're doing with it. the marc export stuff does not -- it's just database code
15:48	* berick	is pushing additions fwiw
15:48	Dyrcona	I was thinking of the OpenSRF and Evergreen packages.
15:49	Dyrcona	I see commits from 3 minutes ago. Do you have some that are more recent?
15:50	berick	3 mins sounds about right
15:50	berick	you only need opensrf/evergreen redis branches if you want to talk to opensrf/evergreen via the opensrf network.
15:51	Dyrcona	Thanks! That was what I thought, but looking at one of the examples I thought it might work without redis. berick++
15:51	berick	IOW, evergreen-universe-rs does a variety of stuff, but when it talks to EG, it assumes Redis is the communication layer
15:51	berick	ah, no, redis required for those actions
15:55	pinesol	News from commits: Docs: Circulation Patron Record Page <https://git.evergreen-ils.org/?p=Evergreen.git;a=commitdiff;h=9d632b3589a263333a187fda59a708fe672f2813>
15:56	Dyrcona	If I'm going to start messing with Rust, I guess I should dust off the VM where I tested the RedisRF branches.
15:56	berick	muahaha i have successfully distracted you :)
15:57	Dyrcona	:)
16:01	Dyrcona	If the Rust export is faster, then I won't consider it a distraction. :)
16:10	berick	Dyrcona: if you're building on ubuntu 22.04, see: https://github.com/kcls/evergreen-universe-rs/tree/main#ubuntu-2204-2023-10-11-dependency-issue
16:14	Dyrcona	berick: Thanks, I will build on 22.04 most likely. Looks like they package the same Rust version for 20.04.
16:15	Dyrcona	Also, I did an explain analyze on one of the copy queries and got this: https://explain.depesz.com/s/p7iT#html It's not terrible, but that sequence scan on cp_cn_idx appears to be a problem, but I don't know how I could make that faster.
16:16		sandbergja joined #evergreen
16:17	sandbergja	abneiman++ # such docs, much commit
16:17	abneiman	:-D
16:17	abneiman	we'll see if I busted the build again, lol
16:17	Dyrcona	abneiman++ sandbergja++
16:18	sandbergja	Github was able to build them at least! https://github.com/evergreen-library-system/Evergreen/actions/runs/6500376854
16:22	Dyrcona	The query to look up the asset.call_numbers for the same record that the copy query came from doesn't look too terrible, either: https://explain.depesz.com/s/KRmm I wonder if I could speed it up by avoiding the join? (Not sure how to do that, either.)
16:24	Dyrcona	The main query was undefined when I tried to dump it, so I'll have to try again with some code changes to marc_export.
16:25	pinesol	News from commits: LP#2039186 Unable to schedule a report at 8 AM <https://git.evergreen-ils.org/?p=Evergreen.git;a=commitdiff;h=80b333e8f94fb27d332747e7d0a216054061a995>
16:25	pinesol	News from commits: Docs: Reports docs fixes <https://git.evergreen-ils.org/?p=Evergreen.git;a=commitdiff;h=b67c1b1e2441e4cda5edfe3f0779583bc7918806>
17:04		mmorgan1 left #evergreen
17:16		Stompro joined #evergreen