IRC log for #evergreen, 2022-12-08

All times shown according to the server's local time.

Time	Nick	Message
06:48		JBoyer joined #evergreen
07:34		kworstell-isl joined #evergreen
07:34		BDorsey joined #evergreen
08:26		mantis1 joined #evergreen
08:40		mmorgan joined #evergreen
08:41		Dyrcona joined #evergreen
08:53	Dyrcona	Oof!
08:54	Dyrcona	Should have paid more attention. Got an error from apt missing keys... Well, duh. I deleted pgdg-keyring yesterday because I had to in order to upgrade postgresql-common. So, reinstalling pgdg-keyring this morning and all the other PostgreSQL packages are being removed.
08:55	Dyrcona	Don't let it happen to you!
08:58	Dyrcona	Fortunately "remove" is not "purge."
08:58		dguarrac joined #evergreen
09:35		Dyrcona joined #evergreen
09:36		BDorsey joined #evergreen
09:47	Dyrcona	It also helps to remember that you laptop isn't configured for the office WiFi before you unplug it from the USB "dock" that provides its Ethernet. :)
09:50	Dyrcona	csharp_: I have checked the explain output, but based on runtime, it looks Pg 11 is not affected by the change that causes search to be slow on Pg 14.
09:50	Dyrcona	s/have/haven't/
09:53		berick joined #evergreen
09:55	Dyrcona	Does anyone know if optimization settings have much effect on pg_restore? I suspect that they do. I'm considering just restoring the database on Pg 12 and Pg 13 before continuing with running further explains.
09:55	Dyrcona	I suppose I could ask in #postgresql.
09:58	Dyrcona	Looks like it is going to a typo and omission kind of day.
10:00	jeff	heh.
10:00	csharp_	Dyrcona: thanks - I'm importing into PG 11 now
10:01	Dyrcona	csharp_: Cool. I ran an explain on Pg 11 this morning. I usually configure the database, restart that Pg instance, then do the explain twice using the second result.
10:02	Dyrcona	I'm restoring into Pg 12. That will probably not finish until sometime this evening.
10:03	jeff	Dyrcona: are you asking if postgres settings affect the performance of pg_restore, or are you asking about something else?
10:04	Dyrcona	jeff: That's exactly what I'm asking.
10:06	jeff	There are a number of settings that you can tweak to make pg_restore complete faster. Most are "running with scissors" things like setting WAL to minimal, increasing checkpoint_segments, setting fsync=off (DANGER, Will Robinson!), etc. I've also seen recommendations to decrease shared_buffers so that you can increase work_mem to help with index building.
10:07	jeff	Oh, also: turning off autovacuum is usually a good idea. You're probably planning to vacuum when the restore is complete anyway.
10:07	Dyrcona	jeff: Thanks. I'll just leave things as they are.
10:08	Dyrcona	I've got a file with settings that I think are optimized for this server. I move it from conf.d to conf.d for whatever version I'm testing. I restart those specific Pg instances, and then start the restore.
10:09	jeff	I don't if there's an authoritative, up to date list of these recommendations. Not seeing it in a quick few searches.
10:09	Dyrcona	I suppose I could do some "fancy" things to speed up the restore, but whatever. I was mainly thinking about restoring into a database with default settings versus one with better settings for the hardware.
10:10	jeff	Oh, and of course -j, but you probably already had that in mind and weren't considering it as a "setting". :-)
10:10	Dyrcona	Yeah. Pg performance has always felt like voodoo. I think it varies a lot by workload.
10:11	Dyrcona	Yeah, I'm using -j 4. I sometimes use -j 6.
10:15	Dyrcona	I'll try -j 6 on Pg 13 tomorrow. I think it does better with -j 6, but I was trying -j 4 in case disk is a bottleneck.
10:19		sleary joined #evergreen
10:20		rfrasur joined #evergreen
10:59		kworstell_isl joined #evergreen
11:09		BDorsey_ joined #evergreen
11:27		kworstell_isl joined #evergreen
11:51		kworstell_isl_ joined #evergreen
11:52		jihpringle joined #evergreen
11:52		stompro_home joined #evergreen
12:44		jvwoolf joined #evergreen
13:03		Dyrcona joined #evergreen
13:55		mantis1 joined #evergreen
14:00	Dyrcona	csharp_: Looks like the trouble starts with Pg 12.
14:18	Dyrcona	csharp_: As soon as I determine what change in Pg triggers these slowdowns, I'll open a Lp bug.
14:30	csharp_	Dyrcona++
14:31	Dyrcona	Looking through the Pg 12 release notes, nothing jumps out at me.
14:37	csharp_	Dyrcona: are you looking at https://www.postgresql.org/docs/12/release-12.html#id-1.11.6.18.5 ?
14:39	Dyrcona	Yes, and the section before that.
14:41	csharp_	cool
14:43	csharp_	confirmed that the Big Ugly Query™ returns quickly on PG 11
14:44	Dyrcona	:)
14:55	Dyrcona	@band add Vaguely Quabbin
14:55	pinesol	Dyrcona: Band 'Vaguely Quabbin' added to list
14:59	Dyrcona	A couple of things look like they might have affected this, but nothing looks super solid: changes to add more languages to full text search, inlining of CTEs, and possibly an optimizer change to better handle partial indexes with is not null conditions.
15:04	Dyrcona	@ana Vaguely Quabbin
15:04	pinesol	Dyrcona: Bug qua enviably
15:19	csharp_	the optimizer change was what caught my eye
15:20	Dyrcona	yeah. There's a lot of convolutedness going on at that point in the query.
15:24	Dyrcona	I notice that the default schema does not appear to make an index on vis_attr_cache.
15:25	Dyrcona	This makes me wonder where the index scans in my explain output are coming from.
15:25		sleary joined #evergreen
15:26	Dyrcona	I wonder if adding a GIN index using to_tsvector() would speed things up?
15:27	Dyrcona	I'll try that on Pg 13 tomorrow after I've run the explain a couple of times.
15:33	jeff	are you talking about indexes on the asset.copy_vis_attr_cache table?
15:33	Dyrcona	Yeah, and maybe biblio.record_entry.
15:40	jeff	do either of you have explain output for the same query on 11 and >11 on an otherwise-similar dataset/evergreen version? I'll scroll back, but if you have them handy...
15:41	Dyrcona	I have it from Pg 10 and Pg 14. I can add the Pg 11. Just a sec.
15:41	jeff	And Dyrcona, did you see miker point out the other day that @@ isn't exclusive to tsvector types?
15:42	jeff	<miker> csharp_ / Dyrcona (later): @@ is not just for tsvectors. see: https://www.postgresql.org/docs/14/intarray.html
15:42	jeff	<miker> IOW, it's not doing tsearch stuff there, it's doing query_int stuff
15:42	jeff	if you have 10 I'd be interested in that also, especially if you are about to have 10, 11, and 14 all with similar query on similar data.
15:43	jeff	(I'm curious to look at it later, and can make zero promises of any results, so don't go out of your way on my behalf...)
15:43	Dyrcona	The pg 10 and pg14 explains are up at depesz's site already.
15:44	Dyrcona	I wonder if adding an index would help, still.
15:48	jeffdavis	I'm looking at https://explain.depesz.com/s/EBTe - I don't really understand this stuff, but why is the vis_attr_vector filter on copy_vis_attr_cache returning 2 million rows?
15:48	Dyrcona	Pg 10: https://explain.depesz.com/s/pCZz \| Pg 14: https://explain.depesz.com/s/Ay5y \| Pg 11: https://explain.depesz.com/s/EwZJ
15:49	Dyrcona	jeffdavis: It's a seq scan on a table.
15:49	Dyrcona	Looking at my explains again, I see why the index scan happens. It's not on the field that I expected.
15:51	Dyrcona	Pg 11 explain looks really nice compared to the other two. :)
15:54	jeffdavis	Right, but why seq scan over all the rows in the table when there's another condition that would eliminate almost all those rows?
15:55	Dyrcona	jeff: I didn't see miker's comment, but I did have an inkling of a memory of @@ being useful for more than just ts queries.
15:59	* Dyrcona	blames the filter condition and thinks an index might help.
16:00	Dyrcona	Notice that on Pg 14 with the Index Scan, it's still looking at over 2 million rows, while on Pg 11 it's only looking at 21 rows, which is how many copies we have.
16:08	* miker	goes to look at those
16:11	miker	jeffdavis: in your link, you're asking about row 60?
16:19	miker	anyway, it's certainly making bad choices about plan structure int pg 14
16:25	miker	Dyrcona: do you have 250k live bibs and 1.98m deleted bibs?
16:27	jeffdavis	(yes, I was asking about row 60)
16:30	miker	jeffdavis: is that your EXPLAIN or was that that Dyrcona's?
16:30	jeffdavis	That was one of Dyrcona's.
16:31	miker	kk
16:31	Dyrcona	miker: I think that was csharp_'s explain.
16:32	miker	Dyrcona: re my question to you, I was looking at your pg 14 link, Ay5y
16:32	Dyrcona	Never mind. I'm not sure whose EBTe is.
16:32	Dyrcona	I wouldn't be surprised if we have 1.98 million deleted bibs.
16:32	Dyrcona	I'm about to take off for the day.
16:32	miker	but, only 250k non-deleted?
16:33	Dyrcona	I can get a count.
16:34	miker	ah, well, you might try an expression index on id where not deleted, since that's the filter on the seq scan under the hash join at line 48. (I wonder if you have "retain deleted" turned on...?)
16:41	Dyrcona	I canceled my query it was running on a non-optimized Pg instance while a pg_restore is going on. I have a 110+ mile drive ahead of me, so I'm out.
17:19		mmorgan left #evergreen
17:29		jvwoolf left #evergreen
20:16		stompro_home joined #evergreen
20:18		stompro__ joined #evergreen
22:26		StomproJ joined #evergreen
22:26		stompro_home joined #evergreen
22:27		jeffdavis_ joined #evergreen
22:28		Christineb_ joined #evergreen