[08:30] *** mmorgan joined #evergreen
[08:47] *** dguarrac joined #evergreen
[09:32] *** Rogan_ joined #evergreen
[09:35] *** tsadok joined #evergreen
[09:37] *** Dyrcona joined #evergreen
[10:19] <Dyrcona> jeffdavis: I can't reproduce Lp 2106679 with Evergreen 3.14.4 and Pg 16.8. I'm waiting to see if John Amundson is going to comment on the bug before I do.
[10:19] <pinesol> Launchpad bug 2106679 in Evergreen "Items Out is very slow to load" [Undecided,New] https://launchpad.net/bugs/2106679
[10:39] <berick> huh neat just noticed the cursor (left/right) in vim/terminal moves with the direction of the source language. e.g. עִברִית
[10:40] <csharp_> jeffdavis: maybe a missing index somewhere?
[10:40] <csharp_> (we're on 3.14.3 and haven't seen the issue that I'm aware of)
[11:18] *** Christineb joined #evergreen
[11:22] *** jihpringle joined #evergreen
[11:23] *** jvwoolf joined #evergreen
[12:11] <eeevil> jeffdavis: LP will let you know soon, but I pushed an (entirely untested, didn't even attempt to compile it) patch as food for thought re "slow items out" vs cursors
[12:15] *** mantis joined #evergreen
[12:16] <mantis> we're looking into a reingest project; does anyone have a recommendation on where to start for reading purposes? I remember there was something in past release notes but can't remember the version number
[12:33] <Bmagic> mantis: I assume you've seen this: https://docs.evergreen-ils.org/docs/latest/development/support_scripts.html#pingest_pl
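
(For context, a minimal sketch of a pingest.pl invocation. Option names are taken from the support-scripts documentation linked above and can differ between Evergreen releases, and the connection values are placeholders, so verify everything against `pingest.pl --help` on your own system.)

```bash
# A hedged sketch, not a recipe: option names come from the support-scripts
# doc linked above and may vary by Evergreen release; --db/--user/--host
# values are placeholders for your own site.
#   --max-child  : number of parallel worker processes to fork
#   --batch-size : number of records handed to each worker per batch
perl pingest.pl --db evergreen --user evergreen --host localhost \
     --max-child 4 --batch-size 10000
```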
[12:34] <mantis> ah yes! I did remember viewing this doc, but wasn't sure where it was. Thank you!
[12:34] <mantis> Bmagic++
[12:35] <Bmagic> let us know if you have any questions!
[12:37] *** jvwoolf joined #evergreen
[12:38] <mantis> Bmagic: definitely for sure
[12:39] <mantis> I would like to start these on test servers during off hours next week; do you have any best practice tips? We haven't been able to perform a successful parallel ingest before
[12:39] <mantis> I think we did try, but there was an issue with either storage or just having too many records, and a timeout happened
[12:39] <mantis> not 100% sure on that but it definitely had something to do with the size of our collection
[12:40] <Bmagic> what's the total number of bibs?
[12:44] <mantis> we're somewhere in the 600k area - at least that's what I'm seeing when running counts on IDs
[12:45] <Bmagic> the main issue we've run into is running the reingest at the same time as a backup. That usually results in a full disk situation. So if your backups are running at night, you want your reingest to finish (or a portion thereof) before the backup starts
[12:49] <mantis> Ok I'll keep everyone abreast of the project! Thank you!
[12:49] <mantis> Bmagic++
[12:49] <Dyrcona> mantis: I run it on 1.8 million bibs all the time.
[12:49] <mantis> Dyrcona: how often?
[12:50] <Dyrcona> At least monthly on my test systems.
[12:50] <Dyrcona> It is generally only necessary after database upgrades.
[12:51] <Dyrcona> Well, upgrades that affect search.
[12:52] <mantis> what's the difference between parallel and queued? parallel does it in concurrent batches?
[12:53] <Dyrcona> queued is newer and supposed to supplant parallel. parallel forks multiple processes. The browse indexes can't be done in parallel (or couldn't when the script was implemented), so they're done as a single batch.
[12:54] <mantis> is there a preference on which one to do first?
[12:54] <Dyrcona> queued ingest is also geared more toward day-to-day processing, IIRC. parallel is for when you want to reingest the whole thing or significant parts thereof.
[12:55] <Dyrcona> I've never manually kicked off a queued ingest if that helps, but I'm biased. :)
[12:55] <Bmagic> that said: I've used the queued ingest tool to reingest the whole database
[12:56] *** jihpringle joined #evergreen
[12:57] <Dyrcona> For ingesting the whole database, it's more of an either/or. I'd probably recommend a queued ingest. If you have symspell enabled, it plays more nicely with it, I think.
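
(An aside for readers: queued ingest is driven by global flags in the database. The sketch below only inspects them; the `ingest.queued.*` prefix is an assumption based on the Evergreen 3.11 release notes that introduced queued ingest, so confirm what your schema actually defines before enabling anything.)

```bash
# List whatever queued-ingest global flags this database defines. The
# ingest.queued.* prefix is an assumption from the 3.11 release notes;
# config.global_flag itself is a long-standing Evergreen table.
psql -d evergreen -c \
  "SELECT name, enabled, value FROM config.global_flag
    WHERE name LIKE 'ingest.queued.%' ORDER BY name;"

# Flags are toggled with a plain UPDATE, e.g. (on a test server first!):
#   UPDATE config.global_flag SET enabled = TRUE WHERE name = 'ingest.queued.all';
```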
[12:58] <Bmagic> the key is understanding *how long* the server needs to reingest X number of records, and timing it so that it will be done before the nightly backup begins. Once you know roughly how long it takes to reingest 100k bibs, for example, you can estimate the whole run. And if the estimate shows the reingest running into the nightly backup, you'll want to limit the number of bibs it does per execution
[12:59] <Bmagic> let's say it takes 10 hours to reingest 200k bibs. And you have 600k. You'll likely want to break that up into 200k chunks, per day for 3 days
[13:00] <Dyrcona> If it takes that long, disable symspell. :P
[13:00] <Bmagic> so you can kick it off in the morning (after the backup has completed), and allow it to finish the reingest of the 200k bibs. It should finish before the nightly backup. Rinse and repeat each morning.
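
(A sketch of that chunking, using the --start-id/--end-id options that appear in the pingest.pl docs. The ID boundaries are purely illustrative: bib IDs are rarely contiguous, so derive real chunk edges from your own data.)

```bash
# Find the real ID range first; the chunk boundaries below are illustrative
# placeholders for a ~600k-bib database, not real values.
psql -d evergreen -c \
  "SELECT MIN(id), MAX(id) FROM biblio.record_entry WHERE NOT deleted;"

# Day 1: kick off in the morning, after the nightly backup finishes.
perl pingest.pl --db evergreen --user evergreen --host localhost \
     --start-id 1 --end-id 200000
# Day 2: --start-id 200001 --end-id 400000
# Day 3: --start-id 400001 --end-id 600000
```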
[13:01] <Dyrcona> It comes down to why you want to do the ingest. Short of an upgrade, you normally wouldn't need to. If you add a new search index, then, yeah, reingest the affected records.
[13:02] <mantis> I think the original reason we wanted to do this was an issue with search index terms within our KPAC that got reported
[13:02] <mantis> would searching also become faster in terms of performance?
[13:03] <Dyrcona> It will have little if any impact on search performance.
[13:04] <Dyrcona> The purpose of ingest is to make sure that the search indexes are up to date with your records. This should be taken care of normally, but under some circumstances, they get out of whack.
[13:04] <mantis> why do you run ingests monthly?
[13:05] <Dyrcona> I copy a dump of production data to a local server. I load that dump into a database. I make a copy of that database which is updated to a more recent version of Evergreen. That gets dumped again, and reloaded as yet a third database.
[13:06] <Dyrcona> This third database is connected to a test vm running that release of Evergreen. I run a pingest on that database to make sure that search works.
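
(Dyrcona's dump/upgrade/reload cycle, sketched with stock PostgreSQL tools. Database names are placeholders, and the Evergreen schema-upgrade step itself is elided since it is version-specific.)

```bash
# 1. Dump production (custom format) and load it on the test host.
pg_dump -Fc -d evergreen -f evergreen-prod.dump   # run on/against production
createdb eg_baseline
pg_restore -d eg_baseline evergreen-prod.dump

# 2. Copy the baseline, then apply the version-upgrade SQL to the copy
#    (scripts live under Open-ILS/src/sql/Pg/version-upgrade/ in the source).
createdb -T eg_baseline eg_upgraded
# ... run the relevant upgrade scripts against eg_upgraded ...

# 3. Dump the upgraded copy and reload it as the third database the test VM
#    actually talks to, then run pingest.pl against it to check search.
pg_dump -Fc -d eg_upgraded -f eg-upgraded.dump
createdb eg_test
pg_restore -d eg_test eg-upgraded.dump
```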
[13:07] <Dyrcona> I find it is often more convenient to just do a pingest of the whole thing rather than do the SQL bits included in the DB upgrades' comments.
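
(The "SQL bits" are typically commented-out reingest calls at the bottom of an upgrade script. A hedged example follows; the argument list of metabib.reingest_metabib_field_entries has changed across Evergreen versions, so check the function definition in your schema before running anything like this.)

```bash
# Reingest field entries for every live bib. The boolean arguments
# (skip_facet, skip_browse, skip_search, ...) vary by Evergreen version;
# inspect the function with \df+ in psql first.
psql -d evergreen -c \
  "SELECT metabib.reingest_metabib_field_entries(id, FALSE, FALSE, FALSE)
     FROM biblio.record_entry
    WHERE NOT deleted;"
```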
[13:08] <Dyrcona> In production, we pretty much never do an ingest, unless we added a new search index (rare) or just did a version upgrade (probably less rare).
[13:08] <mantis> ah I see
[13:08] <mantis> so it's just a preference
[13:09] <Dyrcona> it might fix your KPAC issues depending on the problem. I don't know enough about the particulars to really say.
[13:09] <Bmagic> mantis: he's reingesting on a test machine. Which can be done willy-nilly without concern about uptime and production nightly backup interference.
[13:10] <Bmagic> It's a good idea to get familiar with the procedure on a test machine though!
[13:10] <Bmagic> You can use the experience that you gain from running the procedure on a test machine to see how long it takes, which should translate roughly to the time it will take on production (if your test machine is close in size and shape)
| 13:10 |
Dyrcona |
If the Evergreen upgrade affects search indexes, then doing some sort of ingest is pretty much mandatory, or search will not reflect the changes in the upgrade. That's why I do the ingest on the database that has been upgraded to a more recent version of Evergreen than the original data. |
| 13:12 |
Dyrcona |
Not knowing the history of your database and upgrades, I can't say if an ingest will solve your KPAC search issues or not. |
| 13:13 |
mantis |
I can always give it a try anyway! |
| 13:13 |
mantis |
thanks to you both |
| 13:13 |
Bmagic |
one way to know: copy your production database to a test machine, ensure that you're having the same issue on your test machine as you do on production. Then run your reingest on the test machine, and see if it makes a difference |
| 13:14 |
|
redavis joined #evergreen |
| 13:17 |
Dyrcona |
You should be able to do either type of ingest while users are using Evergreen. They won't notice for the most part. |
| 13:17 |
Dyrcona |
We used to kick a pingest off after the upgrade and start services so users could use Evergreen while the ingest was chugging away. |
[15:14] *** jihpringle joined #evergreen
[15:58] *** mantis left #evergreen
[16:24] *** jvwoolf joined #evergreen
[17:08] *** mmorgan left #evergreen
[17:36] <jeffdavis> eeevil++
[17:36] <jeffdavis> that branch for bug 2106679 fixes our Items Out issue, not sure about side effects yet though
[17:36] <pinesol> Launchpad bug 2106679 in Evergreen "Items Out is very slow to load" [Undecided,New] https://launchpad.net/bugs/2106679
[17:45] *** jihpringle joined #evergreen
[20:38] <eeevil> for the logs, if Dyrcona and mantis look later, there's a branch that addresses the browse index serialization issue, and (barring bugs) will allow queued ingest to /actually/ do parallel ingest, and it includes a pingest.pl update to teach it the browse version of the parallelizing trick it uses for symspell.
[20:41] *** jvwoolf joined #evergreen
[22:05] *** pinesol joined #evergreen