
IRC log for #evergreen, 2023-10-12


All times shown according to the server's local time.

Time Nick Message
06:58 collum joined #evergreen
07:11 kworstell-isl joined #evergreen
07:41 BDorsey joined #evergreen
07:56 redavis joined #evergreen
08:01 sandbergja joined #evergreen
08:16 Rogan joined #evergreen
08:38 mmorgan joined #evergreen
08:45 dguarrac joined #evergreen
10:20 Dyrcona joined #evergreen
10:25 Dyrcona jeff: It definitely looks like the slow down in marc_export is caused by the sheer amount of data for the Perl to process. It's taking about the same amount of time without the options that I thought might be slowing things down. I'm going to do smaller batches to see if that helps.
10:29 berick Collaborative code review sessions.. 4:45am Sundays?
10:32 mmorgan berick: 11am Pacific/2pm Eastern on Mondays :) https://wiki.evergreen-ils.org/doku.php?id=faqs:evergreen_roadmap
10:32 Dyrcona berick: Nah. On the 22nd, I'll be on my way to the airport for a flight to Indianapolis. :)
10:32 berick was just looking at https://evergreen-ils.org/communicate/calendar/
10:33 Dyrcona berick: I gotcha. Be nice to know what timezone that is.
10:34 * mmorgan didn't even see that!
10:36 Dyrcona Tahiti! I'm in!
10:36 berick woohoo
10:37 Dyrcona UTC-10 is the closest I came up with, maybe it should be UTC-9 or UTC-11. Not sure.
10:39 Stompro Can anyone give me a hint where to find the Legacy undated circs in the database.
10:40 mmorgan Stompro: extend_reporter.legacy_circ_count
10:40 Stompro mmorgan++ thank you!!
10:41 mmorgan YW!
10:43 Stompro Dyrcona, I was conversing with Brian about his marc_export issues.  And I did a test export of our DB.  200K bibs took 5 minutes, used 1.2GB of RAM, and created a 1GB uncompressed xml file, which compressed to 115MB.
10:43 Dyrcona user_ingest_name_keywords_tgr BEFORE INSERT OR UPDATE ON actor.usr FOR EACH ROW EXECUTE PROCEDURE actor.user_ingest_name_keywords() <- I think that should maybe be an AFTER, but I'll play with it.
10:45 Stompro That was without items... I should try it again with items.
10:45 Dyrcona Stompro: I still have to do this in production, but it has always taken longer than that. Are you feeding IDs to marc_export, and what marc_export options are you using?
10:48 Dyrcona Yeah, adding --items is what seems to do it.
10:48 Stompro Feeding ids, only arguments there were EXTRACT_ARGS="--encoding=UTF-8 -f XML"
10:48 briank joined #evergreen
10:49 Dyrcona Also, I realize my comment about the user_ingest_name_keywords_tgr needing to be AFTER is bogus. I pulled an extra couple of fields in my query and discovered why I was seeing what I thought was anomalous.
10:50 kworstell_isl joined #evergreen
10:51 Dyrcona Some test accounts have weird names. :)
10:52 Dyrcona The problem could be the extra query to grab item data combined with the massive amount of data.
10:54 Stompro Dyrcona, --items didn't change the memory usage, still 1.2G for 194432 bibs.. run time seems longer.. I'll report back when done.
10:56 Dyrcona I think running that select for items in a loop is the real issue. I should refactor this to grab the items at the time the query runs. That complicates the main loop though.
11:00 Dyrcona I wonder if pulling out an ARRAY of asset.copy%ROWTYPE is possible, and if so, will that greatly expand the memory use? I think "Yes" on the second point.
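[Editor's sketch] The refactor Dyrcona describes, replacing a per-record item lookup with a single query that returns the items along with the records, can be illustrated roughly as below. This is not marc_export's code: the `bib` and `item` tables, their columns, and both function names are hypothetical stand-ins, and an in-memory SQLite database is used only so the sketch runs on its own.

```python
import sqlite3
from collections import defaultdict

# Hypothetical, heavily simplified tables; Evergreen's real schema differs.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bib  (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE item (id INTEGER PRIMARY KEY, bib INTEGER, barcode TEXT);
    INSERT INTO bib  VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO item VALUES (10, 1, 'X1'), (11, 1, 'X2'), (12, 3, 'X3');
""")


def items_per_record_loop(conn):
    """One extra query per bib: N bibs cost N + 1 round trips."""
    result = {}
    for (bib_id,) in conn.execute("SELECT id FROM bib ORDER BY id").fetchall():
        rows = conn.execute(
            "SELECT barcode FROM item WHERE bib = ? ORDER BY id", (bib_id,)
        ).fetchall()
        result[bib_id] = [barcode for (barcode,) in rows]
    return result


def items_single_query(conn):
    """One LEFT JOIN returns every bib with its items; rows are grouped in memory."""
    result = defaultdict(list)
    query = """
        SELECT bib.id, item.barcode
          FROM bib LEFT JOIN item ON item.bib = bib.id
         ORDER BY bib.id, item.id
    """
    for bib_id, barcode in conn.execute(query):
        items = result[bib_id]  # creates an empty list for bibs without items
        if barcode is not None:
            items.append(barcode)
    return dict(result)
```

Both functions return the same mapping of record id to barcodes. The joined form repeats the record id on every item row, which is the memory trade-off raised above; because the rows arrive ordered by record id, a streaming consumer could emit each record as soon as its id changes instead of holding everything.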
11:01 Stompro Dyrcona, Runtime was 10m37s for the --items run of 194432 bibs, 1.2G xml file uncompressed,  124M compressed.  So --items doubled the run time.
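[Editor's sketch] The "doubled the run time" figure can be checked from the numbers Stompro reported. The helper name is made up for illustration, and "200K" is treated as exactly 200,000 even though the second run suggests the real count was 194,432.

```python
def records_per_minute(records, minutes, seconds=0):
    """Average throughput for a run that took minutes:seconds."""
    return records / (minutes + seconds / 60)


# Run without --items: "200K bibs took 5 minutes" (approximate count).
without_items = records_per_minute(200_000, 5)

# Run with --items: 194432 bibs in 10m37s.
with_items = records_per_minute(194_432, 10, 37)

# Ratio of the two wall-clock times, in seconds.
slowdown = (10 * 60 + 37) / (5 * 60)
```

Under those assumptions the run without items averages 40,000 records per minute, the run with items a little over 18,000, and the wall-clock ratio is slightly more than 2.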
11:02 Dyrcona Yeah. I expected something like that. Stompro++
11:03 Dyrcona I'm seeing something like 1,000/minute performance, but I'll wager we have more items.
11:07 Stompro Without --items, marc_export pegs a cpu core, with --items it was only at 9% usage.  So mostly waiting for query results.
11:08 Dyrcona Over time I've seen it hit 98% CPU usage with --items.
11:09 Dyrcona ps -o etime,pcpu,rss,pmem 110013
11:09 Dyrcona ELAPSED %CPU   RSS %MEM
11:09 Dyrcona 23:29:58 97.2 9550376 29.0
11:09 Dyrcona 360853 <- Number of records in the binary MARC file.
11:10 Dyrcona /openils/bin/marc_export --all --items <-- command line
11:12 Dyrcona I wonder what our average copy count per bib is? Probably not that high, though we do have 1 bib with over 4,000.
11:13 Dyrcona I estimate it will export about 1.78 million records.
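[Editor's sketch] The ps output and record counts above are enough for a rough rate and time-remaining estimate. The sketch assumes the 360853 count was taken at about the same moment as the ps sample, that the 1.78 million estimate holds, and that throughput stays constant, none of which the log confirms. The function name is illustrative.

```python
def etime_to_seconds(etime):
    """Convert a ps ELAPSED value in [[DD-]HH:]MM:SS form to seconds."""
    days = 0
    if "-" in etime:
        day_part, etime = etime.split("-", 1)
        days = int(day_part)
    parts = [int(p) for p in etime.split(":")]
    while len(parts) < 3:  # pad missing hours (and minutes) with zero
        parts.insert(0, 0)
    hours, minutes, seconds = parts
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


elapsed = etime_to_seconds("23:29:58")  # ELAPSED column from the ps output
exported = 360_853                      # records in the binary MARC file so far
expected_total = 1_780_000              # Dyrcona's estimate of the final count

rate_per_minute = exported / (elapsed / 60)
remaining_hours = (expected_total - exported) / (exported / elapsed) / 3600
```

With these inputs the average works out to roughly 256 records per minute, well below the 1,000 per minute mentioned earlier, and a linear extrapolation puts the remaining run time at a bit over 92 hours.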
11:14 Stompro Dyrcona, Nevermind about the cpu usage, that was me seeing the 9% memory used by mistake.
11:15 Stompro Using your method I see 67%cpu
11:16 Dyrcona I'm going to try smaller batches in production starting this afternoon to see if it helps. I may or may not stop the one running on a test vm. Maybe it is time to refactor export to speed things up?
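[Editor's sketch] Splitting a list of record IDs into fixed-size batches could look like the following. The function name and batch size are illustrative, and how each batch is then handed to marc_export is deliberately left out because the log does not show it.

```python
from itertools import islice


def batches(ids, size):
    """Yield consecutive lists of at most `size` IDs from any iterable."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(ids)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


# Example: seven IDs in batches of three.
example = list(batches(range(1, 8), 3))
```

Because the generator pulls IDs lazily, it works equally well on a list in memory or on lines read from a file of IDs.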
11:24 pinesol News from commits: Docs: LP1845957 Permissions List with Descriptions <https://git.evergreen-ils.org/?p=Evergreen.git;a=commitdiff;h=2680eca9e4dbaa79b2cd00c7fd3373311b85901c>
11:24 pinesol News from commits: Docs: LP1845957 Part 1 - Update describing_your_people.adoc <https://git.evergreen-ils.org/?p=Evergreen.git;a=commitdiff;h=c7b205fb604e7827843c7a5ea6542ce02c2f72ef>
11:25 Dyrcona I think I'll break for lunch early and start on that about noon.
11:33 smayo joined #evergreen
11:34 smayo joined #evergreen
11:43 Dyrcona You know what else is slow? Copying a lot of data to a USB stick. This one must be USB 1.1....
11:44 Dyrcona Maybe it is just no good. It's about 7 years old.
11:54 pinesol News from commits: Docs: LP2038448 correction to 3.8 release notes <https://git.evergreen-ils.org/?p=Evergreen.git;a=commitdiff;h=096e64d5d2a681972da48c7ef22ffedec2f2716c>
11:54 pinesol News from commits: Docs: Rearranged the docs reports image files <https://git.evergreen-ils.org/?p=Evergreen.git;a=commitdiff;h=3c2a9993b4ec4822b40cfad0bbba0b0b81daccef>
12:03 jihpringle joined #evergreen
12:05 smayo eeevil Do you remember why you added dow_count in evergreen.find_next_open_time()? I've been looking at https://bugs.launchpad.net/evergreen/+bug/1818912 and I think it might be the culprit.
12:05 pinesol Launchpad bug 1818912 in Evergreen "Single Day Emergency Closings Fail to Update Due Dates Correctly" [High,Confirmed] - Assigned to Steven Mayo (stmayo)
12:19 jeffdavis_ joined #evergreen
12:22 pinesol` joined #evergreen
12:23 smayo Perhaps 'do you remember this thing you wrote 5 years ago' is a bit of a long shot
12:24 kmlussier joined #evergreen
12:24 Dyrcona smayo: Sometimes, "Do I remember what I did yesterday?" is a long shot. :)
12:25 kmlussier Hello #evergreen
12:25 Dyrcona Hello, kmlussier!
12:25 mmorgan Hello!
12:25 kmlussier @dessert [someone]
12:25 * pinesol grabs some Key Lime Cheesecake for kmlussier
12:26 kmlussier Thank you pinesol!
12:29 Dyrcona @coffee kmlussier
12:29 * pinesol brews and pours a cup of El Salvador La Montana Pacamara, and sends it sliding down the bar to kmlussier
12:30 Dyrcona To go with the Key Lime Cheesecake.
12:30 kmlussier Dyrcona: Thanks! I can always use a shot of caffeine, especially with my new commuting schedule.
12:35 * jonadab prefers to get his caffeine from dark chocolate, or flavored black tea.
12:35 jonadab Or both.  Both is good.
12:36 Dyrcona :)
12:37 kmlussier jonadab: I'll accept caffeine in any form that it's given to me. :)
12:49 eeevil smayo: I do not recall ... it's been A While (TM) ;) ... I can look, though
12:51 Dyrcona hmm... marc_export should exit when a bad command line option is passed in.
12:56 Dyrcona It's taking a while to export the "large" bibs. It has to be the items query that is slowing things down. There are 90 bibs in our database with > 500 items.
12:57 eeevil @later tell smayo  I do not recall ... it's been A While (TM) ;) ... I can look, though. UPDATE: looks like it was a "just check a week" flag, basically, though the breakout variable is similar (if 15x larger). if skipping dow_count testing makes everything happy, I'm for it.
12:57 pinesol eeevil: The operation succeeded.
12:58 smayo joined #evergreen
12:58 Dyrcona :)
12:59 smayo Huzzah!
13:01 smayo Should have a patch up once I brush up on the etiquette for changes to the sql.
13:07 kmlussier smayo++ eeevil++
13:10 Dyrcona 18 minutes and 54 seconds and only 13 records exported.... Well, I know where I need to look.
13:34 smayo joined #evergreen
13:48 smayo joined #evergreen
13:50 Dyrcona 58 minutes and it is a bit over halfway done with the 90 large records. I have a 'debug' version of marc_export that I'll use to dump the queries on a test system.
13:56 jihpringle joined #evergreen
14:24 pinesol News from commits: Docs: updates to Z39.50 documentation <https://git.evergreen-ils.org/?p=Evergreen.git;a=commitdiff;h=bb4d795eb16102c35d26f0d59d14da50b86605e4>
14:54 Dyrcona Doing batches does not appear to have improved performance. If anything, it seems worse, but maybe my dev database is faster than production.
15:03 Dyrcona Looks like I'm getting about 500/minute, but I'll check back after a few more batches have run.
15:03 berick Dyrcona: are you running the client in Linux?
15:09 Dyrcona berick: What client? I'm running marc_export on the command line.
15:09 Dyrcona So, yes, it's Linux.
15:09 berick yeah, the client
15:10 berick was curious if you'd be a guinea pig.  i have a limited-functionality rust marc exporter
15:10 berick wonder how it compares
15:10 Dyrcona Well, I was thinking that I would have to implement something just for this, so I'm willing to try your rust exporter.
15:11 berick binary dm'ed
15:11 berick ./eg-marc-export --items --to-xml --out-file /tmp/recs.xml # --help also works / other options to limit the data set
15:12 Dyrcona berick++ I'll give it a look.
15:12 berick Dyrcona++
15:21 jeffdavis Interesting performance problem on our test servers - the Items Out tab is very slow to load. Looks like the call to open-ils.pcrud.search.circ is consistently taking 6.5 seconds per item for some reason.
15:21 berick oof
15:22 jeff do the items in question have extremely high circulation counts?
15:23 jeff we have a test item that gets checked in and out multiple times a day via SIP2 for a Nagios/Zabbix style check.
15:23 jeff I once made the mistake of using that same item to test something unrelated, and it took consistently 6 seconds or more to retrieve in item status.
15:24 jeffdavis No, these are just randomly selected items - the one I checked has 8 total circs.
15:24 jeff (It may not apply in this case. I don't think the pcrud search is going to be trying to get a total circ count for the items.)
15:24 jeff ah, drat.
15:25 jeffdavis Our production environment is not affected, but a test server running the same version of Evergreen (3.9) is, as is a different test server running 3.11.
15:25 mmorgan1 joined #evergreen
15:28 Stompro jeffdavis, are they all on the same Postgres version?
15:29 jeffdavis Yes, all PG14. The test servers all share the same Postgres server, my guess is that's where the issue lies but not sure what the mechanism would be.
15:30 Dyrcona jeffdavis: You have all of the latest patches for Pg installed?
15:30 Dyrcona Meaning Evergreen patches.
15:32 jeffdavis The affected servers are either 3.9.1-ish or 3.11.1-ish with some additional backports -- not fully up to date but pretty close. Are there specific recent patches you're thinking of?
15:36 Dyrcona Yeah, but for some reason I can't find it. I thought it made it into main already.
15:36 Dyrcona it may not affect this either.
15:38 Dyrcona I thought there was something sharpsie found with Pg 12+ a few months ago, but I'm coming up blank.
15:39 Dyrcona Maybe I'm thinking of Lp 1999274, which affects search? (IDK. I've got too much going on.)
15:39 pinesol Launchpad bug 1999274 in Evergreen 3.10 "Performance of Search on PostgreSQL Versions 12+" [Medium,Fix released] https://launchpad.net/bugs/1999274
15:42 jeffdavis we've got that one fortunately
15:44 Rogan joined #evergreen
15:47 Dyrcona berick: evergreen-universe-rs: Does that require the Redis branch, or is that "just" recommended?
15:48 berick Dyrcona: depends on what you're doing with it.  the marc export stuff does not --  it's just database code
15:48 * berick is pushing additions fwiw
15:48 Dyrcona I was thinking of the OpenSRF and Evergreen packages.
15:49 Dyrcona I see commits from 3 minutes ago. Do you have some that are more recent?
15:50 berick 3 mins sounds about right
15:50 berick you only need opensrf/evergreen redis branches if you want to talk to opensrf/evergreen via the opensrf network.
15:51 Dyrcona Thanks! That was what I thought, but looking at one of the examples I thought it might work without redis. berick++
15:51 berick IOW, evergreen-universe-rs does a variety of stuff, but when it talks to EG, it assumes Redis is the communication layer
15:51 berick ah, no, redis required for those actions
15:55 pinesol News from commits: Docs: Circulation Patron Record Page <https://git.evergreen-ils.org/?p=Evergreen.git;a=commitdiff;h=9d632b3589a263333a187fda59a708fe672f2813>
15:56 Dyrcona If I'm going to start messing with Rust, I guess I should dust off the VM where I tested the RedisRF branches.
15:56 berick muahaha i have successfully distracted you :)
15:57 Dyrcona :)
16:01 Dyrcona If the Rust export is faster, then I won't consider it a distraction. :)
16:10 berick Dyrcona: if you're building on ubuntu 22.04, see: https://github.com/kcls/evergreen-universe-rs/tree/main#ubuntu-2204-2023-10-11-dependency-issue
16:14 Dyrcona berick: Thanks, I will build on 22.04 most likely. Looks like they package the same Rust version for 20.04.
16:15 Dyrcona Also, I did an explain analyze on one of the copy queries and got this: https://explain.depesz.com/s/p7iT#html     It's not terrible, but that sequence scan on cp_cn_idx appears to be a problem, but I don't know how I could make that faster.
16:16 sandbergja joined #evergreen
16:17 sandbergja abneiman++ # such docs, much commit
16:17 abneiman :-D
16:17 abneiman we'll see if I busted the build again, lol
16:17 Dyrcona abneiman++ sandbergja++
16:18 sandbergja Github was able to build them at least!  https://github.com/evergreen-library-system/Evergreen/actions/runs/6500376854
16:22 Dyrcona The query to look up the asset.call_numbers for the same record that the copy query came from doesn't look too terrible, either: https://explain.depesz.com/s/KRmm  I wonder if I could speed it up by avoiding the join? (Not sure how to do that, either.)
16:24 Dyrcona The main query was undefined when I tried to dump it, so I'll have to try again with some code changes to marc_export.
16:25 pinesol News from commits: LP#2039186 Unable to schedule a report at 8 AM <https://git.evergreen-ils.org/?p=Evergreen.git;a=commitdiff;h=80b333e8f94fb27d332747e7d0a216054061a995>
16:25 pinesol News from commits: Docs: Reports docs fixes <https://git.evergreen-ils.org/?p=Evergreen.git;a=commitdiff;h=b67c1b1e2441e4cda5edfe3f0779583bc7918806>
17:04 mmorgan1 left #evergreen
17:16 Stompro joined #evergreen
