Time |
Nick |
Message |
06:58 |
|
collum joined #evergreen |
07:11 |
|
kworstell-isl joined #evergreen |
07:41 |
|
BDorsey joined #evergreen |
07:56 |
|
redavis joined #evergreen |
08:01 |
|
sandbergja joined #evergreen |
08:16 |
|
Rogan joined #evergreen |
08:38 |
|
mmorgan joined #evergreen |
08:45 |
|
dguarrac joined #evergreen |
10:20 |
|
Dyrcona joined #evergreen |
10:25 |
Dyrcona |
jeff: It definitely looks like the slowdown in marc_export is caused by the sheer amount of data for Perl to process. It's taking about the same amount of time without the options that I thought might be slowing things down. I'm going to do smaller batches to see if that helps. |
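For context, one way to build those smaller batches is to pull bounded ranges of bib IDs from the database and feed each list to marc_export on stdin. A minimal sketch against the stock Evergreen schema; the 1-100000 range is an arbitrary placeholder, not a value from this discussion:

  -- One batch of bib IDs to feed to marc_export; adjust the bounds per batch.
  SELECT id
    FROM biblio.record_entry
   WHERE NOT deleted
     AND id BETWEEN 1 AND 100000
   ORDER BY id;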
10:29 |
berick |
Collaborative code review sessions.. 4:45am Sundays? |
10:32 |
mmorgan |
berick: 11am Pacific/2pm Eastern on Mondays :) https://wiki.evergreen-ils.org/doku.php?id=faqs:evergreen_roadmap |
10:32 |
Dyrcona |
berick: Nah. On the 22nd, I'll be on my way to the airport for a flight to Indianapolis. :) |
10:32 |
berick |
was just looking at https://evergreen-ils.org/communicate/calendar/ |
10:33 |
Dyrcona |
berick: I gotcha. Be nice to know what timezone that is. |
10:34 |
* mmorgan |
didn't even see that! |
10:36 |
Dyrcona |
Tahiti! I'm in! |
10:36 |
berick |
woohoo |
10:37 |
Dyrcona |
UTC-10 is the closest I came up with, maybe it should be UTC-9 or UTC-11. Not sure. |
10:39 |
Stompro |
Can anyone give me a hint where to find the Legacy undated circs in the database? |
10:40 |
mmorgan |
Stompro: extend_reporter.legacy_circ_count |
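For reference, a minimal sketch of reading that table, assuming the stock columns (id is the copy ID, circ_count the migrated count); verify with \d extend_reporter.legacy_circ_count on your system:

  -- Assumed column names: id (copy), circ_count.
  SELECT lcc.id AS copy_id, lcc.circ_count
    FROM extend_reporter.legacy_circ_count lcc
   WHERE lcc.id = 123456;  -- placeholder copy ID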
10:40 |
Stompro |
mmorgan++ thank you!! |
10:41 |
mmorgan |
YW! |
10:43 |
Stompro |
Dyrcona, I was conversing with Brian about his marc_export issues, and I did a test export of our DB. 200K bibs took 5 minutes, used 1.2GB of RAM, and created a 1GB uncompressed xml file, which compressed to 115MB. |
10:43 |
Dyrcona |
user_ingest_name_keywords_tgr BEFORE INSERT OR UPDATE ON actor.usr FOR EACH ROW EXECUTE PROCEDURE actor.user_ingest_name_keywords() <- I think that should maybe be an AFTER, but I'll play with it. |
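Spelled out as DDL, that \d output corresponds to roughly the following (reconstructed from the line above, not copied from the schema files):

  CREATE TRIGGER user_ingest_name_keywords_tgr
      BEFORE INSERT OR UPDATE ON actor.usr
      FOR EACH ROW
      EXECUTE PROCEDURE actor.user_ingest_name_keywords();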
10:45 |
Stompro |
That was without items... I should try it again with items. |
10:45 |
Dyrcona |
Stompro: I still have to do this in production, but it has always taken longer than that. Are you feeding IDs to marc_export, and what marc_export options are you using? |
10:48 |
Dyrcona |
Yeah, adding --items is what seems to do it. |
10:48 |
Stompro |
Feeding ids, only arguments there were EXTRACT_ARGS="--encoding=UTF-8 -f XML" |
10:48 |
|
briank joined #evergreen |
10:49 |
Dyrcona |
Also, I realize my comment about the user_ingest_name_keywords_tgr needing to be AFTER is bogus. I pulled an extra couple of fields in my query and discovered why I was seeing what I thought were anomalous results. |
10:50 |
|
kworstell_isl joined #evergreen |
10:51 |
Dyrcona |
Some test accounts have weird names. :) |
10:52 |
Dyrcona |
The problem could be the extra query to grab item data combined with the massive amount of data. |
10:54 |
Stompro |
Dyrcona, --items didn't change the memory usage, still 1.2G for 194432 bibs.. run time seems longer.. I'll report back when done. |
10:56 |
Dyrcona |
I think running that select for items in a loop is the real issue. I should refactor this to grab the items at the time the query runs. That complicates the main loop though. |
11:00 |
Dyrcona |
I wonder if pulling out an ARRAY of asset.copy%ROWTYPE is possible, and if so, will that greatly expand the memory use? I think "Yes" on the second point. |
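As a rough illustration of that idea (not the actual marc_export query), copies can be aggregated per bib in a single pass with array_agg over the composite row type; table and column names below are the stock Evergreen schema, and the memory question remains open:

  -- Hypothetical: one query returning each bib with an array of its copies,
  -- instead of a separate copy lookup per record.  Bibs without copies are
  -- omitted here for brevity.
  SELECT bre.id AS bib_id,
         bre.marc,
         ARRAY_AGG(acp) AS copies   -- asset.copy composite rows
    FROM biblio.record_entry bre
    JOIN asset.call_number acn ON acn.record = bre.id AND NOT acn.deleted
    JOIN asset.copy acp ON acp.call_number = acn.id AND NOT acp.deleted
   WHERE NOT bre.deleted
   GROUP BY bre.id, bre.marc;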
11:01 |
Stompro |
Dyrcona, Runtime was 10m37s for the --items run of 194432 bibs, 1.2G xml file uncompressed, 124M compressed. So --items doubled the run time. |
11:02 |
Dyrcona |
Yeah. I expected something like that. Stompro++ |
11:03 |
Dyrcona |
I'm seeing something like 1,000/minute performance, but I'll wager we have more items. |
11:07 |
Stompro |
Without --items, marc_export pegs a cpu core, with --items it was only at 9% usage. So mostly waiting for query results. |
11:08 |
Dyrcona |
Over time I've seen it hit 98% CPU usage with --items. |
11:09 |
Dyrcona |
ps -o etime,pcpu,rss,pmem 110013 |
11:09 |
Dyrcona |
ELAPSED %CPU RSS %MEM |
11:09 |
Dyrcona |
23:29:58 97.2 9550376 29.0 |
11:09 |
Dyrcona |
360853 <- Number of records in the binary MARC file. |
11:10 |
Dyrcona |
/openils/bin/marc_export --all --items <-- command line |
11:12 |
Dyrcona |
I wonder what our average copy count per bib is? Probably not that high, though we do have 1 bib with over 4,000. |
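A quick sketch of one way to answer that against the stock schema (counting only non-deleted call numbers and copies):

  -- Average and maximum copies per bib that has at least one copy.
  SELECT ROUND(AVG(copy_count), 2) AS avg_copies,
         MAX(copy_count)           AS max_copies
    FROM (SELECT acn.record, COUNT(acp.id) AS copy_count
            FROM asset.call_number acn
            JOIN asset.copy acp ON acp.call_number = acn.id
           WHERE NOT acn.deleted
             AND NOT acp.deleted
           GROUP BY acn.record) per_bib;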
11:13 |
Dyrcona |
I estimate it will export about 1.78 million records. |
11:14 |
Stompro |
Dyrcona, Nevermind about the cpu usage, that was me seeing the 9% memory used by mistake. |
11:15 |
Stompro |
Using your method I see 67%cpu |
11:16 |
Dyrcona |
I'm going to try smaller batches in production starting this afternoon to see if it helps. I may or may not stop the one running on a test vm. Maybe it is time to refactor export to speed things up? |
11:24 |
pinesol |
News from commits: Docs: LP1845957 Permissions List with Descriptions <https://git.evergreen-ils.org/?p=Evergreen.git;a=commitdiff;h=2680eca9e4dbaa79b2cd00c7fd3373311b85901c> |
11:24 |
pinesol |
News from commits: Docs: LP1845957 Part 1 - Update describing_your_people.adoc <https://git.evergreen-ils.org/?p=Evergreen.git;a=commitdiff;h=c7b205fb604e7827843c7a5ea6542ce02c2f72ef> |
11:25 |
Dyrcona |
I think I'll break for lunch early and start on that about noon. |
11:33 |
|
smayo joined #evergreen |
11:34 |
|
smayo joined #evergreen |
11:43 |
Dyrcona |
You know what else is slow? Copying a lot of data to a USB stick. This one must be USB 1.1.... |
11:44 |
Dyrcona |
Maybe it is just no good. It's about 7 years old. |
11:54 |
pinesol |
News from commits: Docs: LP2038448 correction to 3.8 release notes <https://git.evergreen-ils.org/?p=Evergreen.git;a=commitdiff;h=096e64d5d2a681972da48c7ef22ffedec2f2716c> |
11:54 |
pinesol |
News from commits: Docs: Rearranged the docs reports image files <https://git.evergreen-ils.org/?p=Evergreen.git;a=commitdiff;h=3c2a9993b4ec4822b40cfad0bbba0b0b81daccef> |
12:03 |
|
jihpringle joined #evergreen |
12:05 |
smayo |
eeevil Do you remember why you added dow_count in evergreen.find_next_open_time()? I've been looking at https://bugs.launchpad.net/evergreen/+bug/1818912 and I think it might be the culprit. |
12:05 |
pinesol |
Launchpad bug 1818912 in Evergreen "Single Day Emergency Closings Fail to Update Due Dates Correctly" [High,Confirmed] - Assigned to Steven Mayo (stmayo) |
12:19 |
|
jeffdavis_ joined #evergreen |
12:22 |
|
pinesol` joined #evergreen |
12:23 |
smayo |
Perhaps 'do you remember this thing you wrote 5 years ago' is a bit of a long shot |
12:24 |
|
kmlussier joined #evergreen |
12:24 |
Dyrcona |
smayo: Sometimes, "Do I remember what I did yesterday?" is a long shot. :) |
12:25 |
kmlussier |
Hello #evergreen |
12:25 |
Dyrcona |
Hello, kmlussier! |
12:25 |
mmorgan |
Hello! |
12:25 |
kmlussier |
@dessert [someone] |
12:25 |
* pinesol |
grabs some Key Lime Cheesecake for kmlussier |
12:26 |
kmlussier |
Thank you pinesol! |
12:29 |
Dyrcona |
@coffee kmlussier |
12:29 |
* pinesol |
brews and pours a cup of El Salvador La Montana Pacamara, and sends it sliding down the bar to kmlussier |
12:30 |
Dyrcona |
To go with the Key Lime Cheesecake. |
12:30 |
kmlussier |
Dyrcona: Thanks! I can always use a shot of caffeine, especially with my new commuting schedule. |
12:35 |
* jonadab |
prefers to get his caffeine from dark chocolate, or flavored black tea. |
12:35 |
jonadab |
Or both. Both is good. |
12:36 |
Dyrcona |
:) |
12:37 |
kmlussier |
jonadab: I'll accept caffeine in any form that it's given to me. :) |
12:49 |
eeevil |
smayo: I do not recall ... it's been A While (TM) ;) ... I can look, though |
12:51 |
Dyrcona |
hmm... marc_export should exit when a bad command line option is passed in. |
12:56 |
Dyrcona |
It's taking a while to export the "large" bibs. It has to be the items query that is slowing things down. These are 90 bibs in our database with > 500 items. |
12:57 |
eeevil |
@later tell smayo I do not recall ... it's been A While (TM) ;) ... I can look, though. UPDATE: looks like it was a "just check a week" flag, basically, though the breakout variable is similar (if 15x larger). if skipping dow_count testing makes everything happy, I'm for it. |
12:57 |
pinesol |
eeevil: The operation succeeded. |
12:58 |
|
smayo joined #evergreen |
12:58 |
Dyrcona |
:) |
12:59 |
smayo |
Huzzah! |
13:01 |
smayo |
Should have a patch up once I brush up on the etiquette for changes to the sql. |
13:07 |
kmlussier |
smayo++ eeevil++ |
13:10 |
Dyrcona |
18 minutes and 54 seconds and only 13 records exported.... Well, I know where I need to look. |
13:34 |
|
smayo joined #evergreen |
13:48 |
|
smayo joined #evergreen |
13:50 |
Dyrcona |
58 minutes and it is a bit over halfway done with the 90 large records. I have a 'debug' version of marc_export that I'll use to dump the queries on a test system. |
13:56 |
|
jihpringle joined #evergreen |
14:24 |
pinesol |
News from commits: Docs: updates to Z39.50 documentation <https://git.evergreen-ils.org/?p=Evergreen.git;a=commitdiff;h=bb4d795eb16102c35d26f0d59d14da50b86605e4> |
14:54 |
Dyrcona |
Doing batches does not appear to have improved performance. If anything, it seems worse, but maybe my dev database is faster than production. |
15:03 |
Dyrcona |
Looks like I'm getting about 500/minute, but I'll check back after a few more batches have run. |
15:03 |
berick |
Dyrcona: are you running the client in Linux? |
15:09 |
Dyrcona |
berick: What client? I'm running marc_export on the command line. |
15:09 |
Dyrcona |
So, yes, it's Linux. |
15:09 |
berick |
yeah, the client |
15:10 |
berick |
was curious if you'd be a guinea pig. i have a limited-functionality rust marc exporter |
15:10 |
berick |
wonder how it compares |
15:10 |
Dyrcona |
Well, I was thinking that I would have to implement something just for this, so I'm willing to try your rust exporter. |
15:11 |
berick |
binary dm'ed |
15:11 |
berick |
./eg-marc-export --items --to-xml --out-file /tmp/recs.xml # --help also works / other options to limit the data set |
15:12 |
Dyrcona |
berick++ I'll give it a look. |
15:12 |
berick |
Dyrcona++ |
15:21 |
jeffdavis |
Interesting performance problem on our test servers - the Items Out tab is very slow to load. Looks like the call to open-ils.pcrud.search.circ is consistently taking 6.5 seconds per item for some reason. |
15:21 |
berick |
oof |
15:22 |
jeff |
do the items in question have extremely high circulation counts? |
15:23 |
jeff |
we have a test item that gets checked in and out multiple times a day via SIP2 for a Nagios/Zabbix style check. |
15:23 |
jeff |
I once made the mistake of using that same item to test something unrelated, and it took consistently 6 seconds or more to retrieve in item status. |
15:24 |
jeffdavis |
No, these are just randomly selected items - the one I checked has 8 total circs. |
15:24 |
jeff |
(It may not apply in this case. I don't think the pcrud search is going to be trying to get a total circ count for the items.) |
15:24 |
jeff |
ah, drat. |
15:25 |
jeffdavis |
Our production environment is not affected, but a test server running the same version of Evergreen (3.9) is, as is a different test server running 3.11. |
15:25 |
|
mmorgan1 joined #evergreen |
15:28 |
Stompro |
jeffdavis, are they all on the same Postgres version? |
15:29 |
jeffdavis |
Yes, all PG14. The test servers all share the same Postgres server, my guess is that's where the issue lies but not sure what the mechanism would be. |
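One thing worth ruling out on a shared test Postgres when the same query behaves fine in production, offered purely as a guess, is stale planner statistics on the relevant tables:

  -- Guess, not a confirmed diagnosis: check when action.circulation was last analyzed.
  SELECT schemaname, relname, last_analyze, last_autoanalyze
    FROM pg_stat_user_tables
   WHERE schemaname = 'action'
     AND relname = 'circulation';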
15:30 |
Dyrcona |
jeffdavis: You have all of the latest patches for Pg installed? |
15:30 |
Dyrcona |
Meaning Evergreen patches. |
15:32 |
jeffdavis |
The affected servers are either 3.9.1-ish or 3.11.1-ish with some additional backports -- not fully up to date but pretty close. Are there specific recent patches you're thinking of? |
15:36 |
Dyrcona |
Yeah, but for some reason I can't find it. I thought it made it into main already. |
15:36 |
Dyrcona |
It may not affect this either. |
15:38 |
Dyrcona |
I thought there was something sharpsie found with Pg 12+ a few months ago, but I'm coming up blank. |
15:39 |
Dyrcona |
Maybe I'm thinking of Lp 1999274, which affects search? (IDK. I've got too much going on.) |
15:39 |
pinesol |
Launchpad bug 1999274 in Evergreen 3.10 "Performance of Search on PostgreSQL Versions 12+" [Medium,Fix released] https://launchpad.net/bugs/1999274 |
15:42 |
jeffdavis |
we've got that one fortunately |
15:44 |
|
Rogan joined #evergreen |
15:47 |
Dyrcona |
berick: evergreen-universe-rs: Does that require the Redis branch, or is that "just" recommended? |
15:48 |
berick |
Dyrcona: depends on what you're doing with it. the marc export stuff does not -- it's just database code |
15:48 |
* berick |
is pushing additions fwiw |
15:48 |
Dyrcona |
I was thinking of the OpenSRF and Evergreen packages. |
15:49 |
Dyrcona |
I see commits from 3 minutes ago. Do you have some that are more recent? |
15:50 |
berick |
3 mins sounds about right |
15:50 |
berick |
you only need opensrf/evergreen redis branches if you want to talk to opensrf/evergreen via the opensrf network. |
15:51 |
Dyrcona |
Thanks! That was what I thought, but looking at one of the examples I thought it might work without redis. berick++ |
15:51 |
berick |
IOW, evergreen-universe-rs does a variety of stuff, but when it talks to EG, it assumes Redis is the communication layer |
15:51 |
berick |
ah, no, redis required for those actions |
15:55 |
pinesol |
News from commits: Docs: Circulation Patron Record Page <https://git.evergreen-ils.org/?p=Evergreen.git;a=commitdiff;h=9d632b3589a263333a187fda59a708fe672f2813> |
15:56 |
Dyrcona |
If I'm going to start messing with Rust, I guess I should dust off the VM where I tested the RedisRF branches. |
15:56 |
berick |
muahaha i have successfully distracted you :) |
15:57 |
Dyrcona |
:) |
16:01 |
Dyrcona |
If the Rust export is faster, then I won't consider it a distraction. :) |
16:10 |
berick |
Dyrcona: if you're building on ubuntu 22.04, see: https://github.com/kcls/evergreen-universe-rs/tree/main#ubuntu-2204-2023-10-11-dependency-issue |
16:14 |
Dyrcona |
berick: Thanks, I will build on 22.04 most likely. Looks like they package the same Rust version for 20.04. |
16:15 |
Dyrcona |
Also, I did an explain analyze on one of the copy queries and got this: https://explain.depesz.com/s/p7iT#html It's not terrible, but that sequence scan on cp_cn_idx appears to be a problem, and I don't know how I could make that faster. |
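For context, the per-record copy lookup being analyzed is presumably something of this shape; this is a hypothetical reconstruction with a placeholder record ID, not the literal SQL that marc_export issues:

  -- Hypothetical shape of the per-bib copy query.
  EXPLAIN (ANALYZE, BUFFERS)
  SELECT acp.*
    FROM asset.copy acp
    JOIN asset.call_number acn ON acn.id = acp.call_number
   WHERE acn.record = 12345   -- placeholder bib ID
     AND NOT acn.deleted
     AND NOT acp.deleted;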
16:16 |
|
sandbergja joined #evergreen |
16:17 |
sandbergja |
abneiman++ # such docs, much commit |
16:17 |
abneiman |
:-D |
16:17 |
abneiman |
we'll see if I busted the build again, lol |
16:17 |
Dyrcona |
abneiman++ sandbergja++ |
16:18 |
sandbergja |
Github was able to build them at least! https://github.com/evergreen-library-system/Evergreen/actions/runs/6500376854 |
16:22 |
Dyrcona |
The query to look up the asset.call_numbers for the same record that the copy query came from doesn't look too terrible, either: https://explain.depesz.com/s/KRmm I wonder if I could speed it up by avoiding the join? (Not sure how to do that, either.) |
16:24 |
Dyrcona |
The main query was undefined when I tried to dump it, so I'll have to try again with some code changes to marc_export. |
16:25 |
pinesol |
News from commits: LP#2039186 Unable to schedule a report at 8 AM <https://git.evergreen-ils.org/?p=Evergreen.git;a=commitdiff;h=80b333e8f94fb27d332747e7d0a216054061a995> |
16:25 |
pinesol |
News from commits: Docs: Reports docs fixes <https://git.evergreen-ils.org/?p=Evergreen.git;a=commitdiff;h=b67c1b1e2441e4cda5edfe3f0779583bc7918806> |
17:04 |
|
mmorgan1 left #evergreen |
17:16 |
|
Stompro joined #evergreen |