
IRC log for #evergreen, 2022-12-08


All times shown according to the server's local time.

Time Nick Message
06:48 JBoyer joined #evergreen
07:34 kworstell-isl joined #evergreen
07:34 BDorsey joined #evergreen
08:26 mantis1 joined #evergreen
08:40 mmorgan joined #evergreen
08:41 Dyrcona joined #evergreen
08:53 Dyrcona Oof!
08:54 Dyrcona Should have paid more attention. Got an error from apt about missing keys... Well, duh. I deleted pgdg-keyring yesterday because I had to in order to upgrade postgresql-common. So, reinstalling pgdg-keyring this morning, and all the other PostgreSQL packages are being removed.
08:55 Dyrcona Don't let it happen to you!
08:58 Dyrcona Fortunately "remove" is not "purge."
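For reference, apt's "remove" uninstalls a package but keeps its configuration files, while "purge" deletes those too, which is why this mishap was recoverable. A rough sketch of the recovery, assuming a Debian/Ubuntu host and PostgreSQL 14 (the exact versions removed aren't stated above):

    # Reinstalling the keyring restores the repository signing key,
    # but the PostgreSQL packages apt removed alongside it must come back too.
    sudo apt install pgdg-keyring postgresql-14

    # "remove" keeps config under /etc/postgresql; "purge" would have deleted it:
    #   sudo apt remove postgresql-14    # binaries gone, config preserved
    #   sudo apt purge postgresql-14     # binaries and config both gone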
08:58 dguarrac joined #evergreen
09:35 Dyrcona joined #evergreen
09:36 BDorsey joined #evergreen
09:47 Dyrcona It also helps to remember that your laptop isn't configured for the office WiFi before you unplug it from the USB "dock" that provides its Ethernet. :)
09:50 Dyrcona csharp_: I haven't checked the explain output, but based on runtime, it looks like Pg 11 is not affected by the change that causes search to be slow on Pg 14.
09:53 berick joined #evergreen
09:55 Dyrcona Does anyone know if optimization settings have much effect on pg_restore? I suspect that they do. I'm considering just restoring the database on Pg 12 and Pg 13 before continuing with running further explains.
09:55 Dyrcona I suppose I could ask in #postgresql.
09:58 Dyrcona Looks like it is going to be a typo and omission kind of day.
10:00 jeff heh.
10:00 csharp_ Dyrcona: thanks - I'm importing into PG 11 now
10:01 Dyrcona csharp_: Cool. I ran an explain on Pg 11 this morning. I usually configure the database, restart that Pg instance, then do the explain twice using the second result.
10:02 Dyrcona I'm restoring into Pg 12. That will probably not finish until sometime this evening.
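Dyrcona's "explain twice, keep the second result" routine measures warm-cache performance: the first run pays for cold reads, the second reflects steady state. A minimal sketch in psql, with a placeholder standing in for the actual search query:

    -- First run warms the OS page cache and shared_buffers; discard its timing.
    EXPLAIN (ANALYZE, BUFFERS) SELECT 1;  -- placeholder for the slow search query
    -- Second run shows steady-state behavior; this is the output to keep.
    EXPLAIN (ANALYZE, BUFFERS) SELECT 1;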
10:03 jeff Dyrcona: are you asking if postgres settings affect the performance of pg_restore, or are you asking about something else?
10:04 Dyrcona jeff: That's exactly what I'm asking.
10:06 jeff There are a number of settings that you can tweak to make pg_restore complete faster. Most are "running with scissors" things like setting WAL to minimal, increasing checkpoint_segments, setting fsync=off (DANGER, Will Robinson!), etc. I've also seen recommendations to decrease shared_buffers so that you can increase work_mem to help with index building.
10:07 jeff Oh, also: turning off autovacuum is usually a good idea. You're probably planning to vacuum when the restore is complete anyway.
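The "running with scissors" settings jeff describes might look roughly like this in postgresql.conf; the values are illustrative, and fsync = off risks unrecoverable corruption on a crash, so it is only defensible for a throwaway restore target:

    # Throwaway-restore tuning only; never run production like this.
    wal_level = minimal            # write less WAL during bulk load
    max_wal_senders = 0            # required when wal_level = minimal
    fsync = off                    # DANGER, Will Robinson: data loss on crash
    autovacuum = off               # vacuum/analyze manually after the restore
    maintenance_work_mem = 2GB     # larger sorts for index builds; size to RAM
    max_wal_size = 10GB            # fewer checkpoints (checkpoint_segments pre-9.5)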
10:07 Dyrcona jeff: Thanks. I'll just leave things as they are.
10:08 Dyrcona I've got a file with settings that I think are optimized for this server. I move it between the conf.d directories of whatever versions I'm testing. I restart those specific Pg instances, and then start the restore.
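On Debian-style packaging, each cluster reads an include directory, so a tuned settings file can follow the version under test. A sketch assuming the stock layout and a hypothetical file name:

    # Move the tuned settings from the Pg 12 cluster to the Pg 13 cluster,
    # then restart only the cluster being tested.
    sudo mv /etc/postgresql/12/main/conf.d/tuning.conf \
            /etc/postgresql/13/main/conf.d/tuning.conf
    sudo pg_ctlcluster 13 main restart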
10:09 jeff I don't know if there's an authoritative, up to date list of these recommendations. Not seeing it in a quick few searches.
10:09 Dyrcona I suppose I could do some "fancy" things to speed up the restore, but whatever. I was mainly thinking about restoring into a database with default settings versus one with better settings for the hardware.
10:10 jeff Oh, and of course -j, but you probably already had that in mind and weren't considering it as a "setting". :-)
10:10 Dyrcona Yeah. Pg performance has always felt like voodoo. I think it varies a lot by workload.
10:11 Dyrcona Yeah, I'm using -j 4. I sometimes use -j 6.
10:15 Dyrcona I'll try -j 6 on Pg 13 tomorrow. I think it does better with -j 6, but I was trying -j 4 in case disk is a bottleneck.
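pg_restore's -j flag runs the data-load and index-build steps across that many parallel jobs; it requires a custom- or directory-format dump. A sketch with placeholder database and file names:

    # Four parallel jobs; try -j 6 if disk I/O isn't already the bottleneck.
    pg_restore -j 4 -d evergreen /srv/dumps/evergreen.dump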
10:19 sleary joined #evergreen
10:20 rfrasur joined #evergreen
10:59 kworstell_isl joined #evergreen
11:09 BDorsey_ joined #evergreen
11:27 kworstell_isl joined #evergreen
11:51 kworstell_isl_ joined #evergreen
11:52 jihpringle joined #evergreen
11:52 stompro_home joined #evergreen
12:44 jvwoolf joined #evergreen
13:03 Dyrcona joined #evergreen
13:55 mantis1 joined #evergreen
14:00 Dyrcona csharp_: Looks like the trouble starts with Pg 12.
14:18 Dyrcona csharp_: As soon as I determine what change in Pg triggers these slowdowns, I'll open a Lp bug.
14:30 csharp_ Dyrcona++
14:31 Dyrcona Looking through the Pg 12 release notes, nothing jumps out at me.
14:37 csharp_ Dyrcona: are you looking at https://www.postgresql.org/docs/12/release-12.html#id-1.11.6.18.5 ?
14:39 Dyrcona Yes, and the section before that.
14:41 csharp_ cool
14:43 csharp_ confirmed that the Big Ugly Query™ returns quickly on PG 11
14:44 Dyrcona :)
14:55 Dyrcona @band add Vaguely Quabbin
14:55 pinesol Dyrcona: Band 'Vaguely Quabbin' added to list
14:59 Dyrcona A couple of things look like they might have affected this, but nothing looks super solid: changes to add more languages to full text search, inlining of CTEs, and possibly an optimizer change to better handle partial indexes with is not null conditions.
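Of those three suspects, CTE inlining is the easiest to isolate: Pg 12 began inlining non-recursive, single-reference CTEs that Pg 11 always materialized, and Pg 12+ can force the old behavior per CTE. A quick check might look like this, with a placeholder body:

    -- Pg 12+ syntax: MATERIALIZED restores the pre-12 optimization fence.
    WITH vis AS MATERIALIZED (
        SELECT record FROM asset.copy_vis_attr_cache   -- placeholder body
    )
    SELECT * FROM vis;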
15:04 Dyrcona @ana Vaguely Quabbin
15:04 pinesol Dyrcona: Bug qua enviably
15:19 csharp_ the optimizer change was what caught my eye
15:20 Dyrcona yeah. There's a lot of convolutedness going on at that point in the query.
15:24 Dyrcona I notice that the default schema does not appear to make an index on vis_attr_cache.
15:25 Dyrcona This makes me wonder where the index scans in my explain output are coming from.
15:25 sleary joined #evergreen
15:26 Dyrcona I wonder if adding a GIN index using to_tsvector() would speed things up?
15:27 Dyrcona I'll try that on Pg 13 tomorrow after I've run the explain a couple of times.
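The index Dyrcona is musing about would look something like the following; the name and expression are assumptions for illustration, and as miker points out below, the query may be doing query_int matching rather than tsearch, in which case a different opclass applies (see the intarray sketch further down):

    -- Hypothetical tsearch index; name and expression are made up.
    CREATE INDEX CONCURRENTLY vis_attr_cache_tsv_idx
        ON asset.copy_vis_attr_cache
        USING GIN (to_tsvector('simple', vis_attr_vector::text));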
15:33 jeff are you talking about indexes on the asset.copy_vis_attr_cache table?
15:33 Dyrcona Yeah, and maybe biblio.record_entry.
15:40 jeff do either of you have explain output for the same query on 11 and >11 on an otherwise-similar dataset/evergreen version? I'll scroll back, but if you have them handy...
15:41 Dyrcona I have it from Pg 10 and Pg 14. I can add the Pg 11. Just a sec.
15:41 jeff And Dyrcona, did you see miker point out the other day that @@ isn't exclusive to tsvector types?
15:42 jeff <miker> csharp_ / Dyrcona (later): @@ is not just for tsvectors. see: https://www.postgresql.org/docs/14/intarray.html
15:42 jeff <miker> IOW, it's not doing tsearch stuff there, it's doing query_int stuff
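For readers unfamiliar with the extension: intarray overloads @@ so an integer array can be matched against a query_int expression, and it ships GIN/GiST operator classes to index that. A minimal demonstration with illustrative values:

    CREATE EXTENSION IF NOT EXISTS intarray;

    -- intarray's @@: does the array contain 1 AND (2 OR 3)?
    SELECT ARRAY[1,2] @@ '1&(2|3)'::query_int;   -- returns true

    -- An index that can serve such matches (table/column from the discussion):
    -- CREATE INDEX ON asset.copy_vis_attr_cache
    --     USING GIN (vis_attr_vector gin__int_ops);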
15:42 jeff if you have 10 I'd be interested in that also, especially if you are about to have 10, 11, and 14 all with similar query on similar data.
15:43 jeff (I'm curious to look at it later, and can make zero promises of any results, so don't go out of your way on my behalf...)
15:43 Dyrcona The pg 10 and pg14 explains are up at depesz's site already.
15:44 Dyrcona I wonder if adding an index would help, still.
15:48 jeffdavis I'm looking at https://explain.depesz.com/s/EBTe - I don't really understand this stuff, but why is the vis_attr_vector filter on copy_vis_attr_cache returning 2 million rows?
15:48 Dyrcona Pg 10: https://explain.depesz.com/s/pCZz | Pg 14: https://explain.depesz.com/s/Ay5y | Pg 11: https://explain.depesz.com/s/EwZJ
15:49 Dyrcona jeffdavis: It's a seq scan on a table.
15:49 Dyrcona Looking at my explains again, I see why the index scan happens. It's not on the field that I expected.
15:51 Dyrcona Pg 11 explain looks really nice compared to the other two. :)
15:54 jeffdavis Right, but why seq scan over all the rows in the table when there's another condition that would eliminate almost all those rows?
15:55 Dyrcona jeff: I didn't see miker's comment, but I did have an inkling of a memory of @@ being useful for more than just ts queries.
15:59 * Dyrcona blames the filter condition and thinks an index might help.
16:00 Dyrcona Notice that on Pg 14 with the Index Scan, it's still looking at over 2 million rows, while on Pg 11 it's only looking at 21 rows, which is how many copies we have.
16:08 * miker goes to look at those
16:11 miker jeffdavis: in your link, you're asking about row 60?
16:19 miker anyway, it's certainly making bad choices about plan structure in pg 14
16:25 miker Dyrcona: do you have 250k live bibs and 1.98m deleted bibs?
16:27 jeffdavis (yes, I was asking about row 60)
16:30 miker jeffdavis: is that your EXPLAIN or was that that Dyrcona's?
16:30 jeffdavis That was one of Dyrcona's.
16:31 miker kk
16:31 Dyrcona miker: I think that was csharp_'s explain.
16:32 miker Dyrcona: re my question to you, I was looking at your pg 14 link, Ay5y
16:32 Dyrcona Never mind. I'm not sure whose EBTe is.
16:32 Dyrcona I wouldn't be surprised if we have 1.98 million deleted bibs.
16:32 Dyrcona I'm about to take off for the day.
16:32 miker but, only 250k non-deleted?
16:33 Dyrcona I can get a count.
16:34 miker ah, well, you might try an expression index on id where not deleted, since that's the filter on the seq scan under the hash join at line 48. (I wonder if you have "retain deleted" turned on...?)
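miker's suggestion, in SQL form, is a partial index covering only live rows, so the not-deleted filter under the hash join can use an index scan instead of passing over ~2m deleted bibs. The index name is made up; the table and condition follow the discussion:

    -- Only the ~250k non-deleted rows are indexed.
    CREATE INDEX CONCURRENTLY record_entry_live_id_idx
        ON biblio.record_entry (id)
        WHERE NOT deleted;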
16:41 Dyrcona I canceled my query; it was running on a non-optimized Pg instance while a pg_restore is going on. I have a 110+ mile drive ahead of me, so I'm out.
17:19 mmorgan left #evergreen
17:29 jvwoolf left #evergreen
20:16 stompro_home joined #evergreen
20:18 stompro__ joined #evergreen
22:26 StomproJ joined #evergreen
22:26 stompro_home joined #evergreen
22:27 jeffdavis_ joined #evergreen
22:28 Christineb_ joined #evergreen
