Evergreen ILS Website

IRC log for #evergreen, 2025-04-10


All times shown according to the server's local time.

Time Nick Message
08:30 mmorgan joined #evergreen
08:47 dguarrac joined #evergreen
09:32 Rogan_ joined #evergreen
09:35 tsadok joined #evergreen
09:37 Dyrcona joined #evergreen
10:19 Dyrcona jeffdavis: I can't reproduce Lp 2106679 with Evergreen 3.14.4 and Pg 16.8. I'm waiting to see if John Amundson is going to comment on the bug before I do.
10:19 pinesol Launchpad bug 2106679 in Evergreen "Items Out is very slow to load" [Undecided,New] https://launchpad.net/bugs/2106679
10:39 berick huh neat just noticed the cursor (left/right) in vim/terminal moves with the direction of the source language.  e.g. עִברִית
10:40 csharp_ jeffdavis: maybe a missing index somewhere?
10:40 csharp_ (we're on 3.14.3 and haven't seen the issue that I'm aware of)
11:18 Christineb joined #evergreen
11:22 jihpringle joined #evergreen
11:23 jvwoolf joined #evergreen
12:11 eeevil jeffdavis: LP will let you know soon, but I pushed an (entirely untested, didn't even attempt to compile it) patch as food for thought re "slow items out" vs cursors
12:15 mantis joined #evergreen
12:16 mantis we're looking into a reingest project; does anyone have a recommendation on where to start for reading purposes?  I remember there was something in past release notes but can't remember the version number
12:33 Bmagic mantis: I assume you've seen this: https://docs.evergreen-ils.org/docs/latest/development/support_scripts.html#pingest_pl
12:34 mantis ah yes!  I did remember viewing this doc, but wasn't sure where it was.  Thank you!
12:34 mantis Bmagic++
12:35 Bmagic let us know if you have any questions!
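[Editor's note: for readers following the docs link above, a minimal pingest.pl dry run might look like the following. The path and option names are taken from the support-scripts documentation but vary by Evergreen version, so verify with the script's own help output before running; the command is only echoed here, not executed.]

```shell
# Hypothetical pingest.pl invocation (verify options against your version's
# docs). --batch-size and --max-duration limit how much work one run does,
# which helps keep a reingest out of the nightly backup window.
CMD="/openils/bin/pingest.pl --batch-size 10000 --max-duration 21600"
echo "would run: $CMD"   # dry run; remove the echo to actually execute
```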
12:37 jvwoolf joined #evergreen
12:38 mantis Bmagic: definitely for sure
12:39 mantis I would like to start these on test servers during off hours next week; do you have any best practice tips?  We haven't been able to perform a successful parallel ingest before
12:39 mantis I think we did try, but there was an issue with either storage or just having too many records, and a timeout happened
12:39 mantis not 100% sure on that but it definitely had something to do with the size of our collection
12:40 Bmagic what's the total number of bibs?
12:44 mantis we're somewhere in the 600k area - at least that's what I'm seeing when running counts on IDs
12:45 Bmagic the main issue we've run into is running the reingest at the same time as a backup. That usually results in a full disk situation. So if your backups are running at night, you want your reingest to finish (or a portion thereof) before the backup starts
12:49 mantis Ok I'll keep everyone abreast of the project!  Thank you!
12:49 mantis Bmagic++
12:49 Dyrcona mantis: I run it on 1.8 million bibs all the time.
12:49 mantis Dyrcona: how often?
12:50 Dyrcona At least monthly on my test systems.
12:50 Dyrcona it is generally only necessary after database upgrades.
12:51 Dyrcona Well, upgrades that affect search.
12:52 mantis what's the difference between parallel and queued?  parallel does it in concurrent batches?
12:53 Dyrcona queued is newer and supposed to supplant parallel. parallel forks multiple processes. The browse indexes can't be done in parallel (or couldn't at the time of implementation of the script), so it's done as a single batch.
12:54 mantis is there a preference on which one to do first?
12:54 Dyrcona queued ingest is also geared more toward day-to-day processing, IIRC. parallel is for when you want to reingest the whole thing or significant parts thereof.
12:55 Dyrcona I've never manually kicked off a queued ingest if that helps, but I'm biased. :)
12:55 Bmagic that said: I've used the queued ingest tool to reingest the whole database
12:56 jihpringle joined #evergreen
12:57 Dyrcona For ingesting the whole database, it's more of an either/or. I'd probably recommend a queued ingest. If you have symspell enabled, it plays more nicely with it, I think.
12:58 Bmagic the key is understanding *how long* the server needs to reingest X number of records, and timing it so that it will be done before the nightly backup begins. After you know roughly how long it takes to reingest 100k bibs, for example, you can estimate the total. And if the estimate shows the reingest running into the nightly backup, you'll want to limit the number of bibs it does per execution
12:59 Bmagic let's say it takes 10 hours to reingest 200k bibs. And you have 600k. You'll likely want to break that up into 200k chunks, per day for 3 days
13:00 Dyrcona If it takes that long, disable symspell. :P
13:00 Bmagic so you can kick it off in the morning (after the backup has completed), and allow it to finish the reingest of the 200k bibs. It should finish before the nightly backup. Rinse and repeat each morning.
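[Editor's note: the back-of-the-envelope scheduling above works out as follows. The figures are the ones from the discussion (600k bibs, roughly 10 hours per 200k-bib chunk) and are purely illustrative.]

```shell
# Rough chunk planning: how many morning runs are needed so each chunk
# finishes before the nightly backup starts.
TOTAL_BIBS=600000
CHUNK=200000
HOURS_PER_CHUNK=10
DAYS=$(( (TOTAL_BIBS + CHUNK - 1) / CHUNK ))   # ceiling division
echo "plan: $DAYS mornings, one ${CHUNK}-bib chunk (~${HOURS_PER_CHUNK}h each)"
```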
13:01 Dyrcona It comes down to why you want to do the ingest. Short of an upgrade, you normally wouldn't need to. If you add a new search index, then, yeah, reingest the affected records.
13:02 mantis I think the original reason we wanted to do this was an issue with search index terms within our KPAC that got reported
13:02 mantis would searching also become faster in terms of performance?
13:03 Dyrcona It will have little if any impact on search performance.
13:04 Dyrcona The purpose of ingest is to make sure that the search indexes are up to date with your records. This should be taken care of normally, but under some circumstances, they get out of whack.
13:04 mantis why do you run ingests monthly?
13:05 Dyrcona I copy a dump of production data to a local server. I load that dump into a database. I make a copy of that database which is updated to a more recent version of Evergreen. That gets dumped again, and reloaded as yet a third database.
13:06 Dyrcona This third database is connected to a test vm running that release of Evergreen. I run a pingest on that database to make sure that search works.
13:07 Dyrcona I find it is often more convenient to just do a pingest of the whole thing rather than do the SQL bits included in the DB upgrades' comments.
13:08 Dyrcona In production, we pretty much never do an ingest, unless we added a new search index (rare) or just did a version upgrade (probably less rare).
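[Editor's note: the monthly test-server cycle described above (copy production data, upgrade a copy of the database, then pingest to verify search) can be sketched as a dry run. All database names, file names, and the upgrade-script path are hypothetical placeholders, not taken from the log; the commands are printed rather than executed.]

```shell
# Sketch of the copy -> upgrade -> reingest test cycle (placeholder names).
for STEP in \
  "pg_dump -Fc -f prod.dump evergreen" \
  "pg_restore -d evergreen_test prod.dump" \
  "psql -d evergreen_test -f version-upgrade/X.Y-X.Z-upgrade-db.sql" \
  "pingest.pl  # then confirm search works on the upgraded copy"
do
  echo "would run: $STEP"
done
```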
13:08 mantis ah I see
13:08 mantis so it's just a preference
13:09 Dyrcona it might fix your KPAC issues depending on the problem. I don't know enough about the particulars to really say.
13:09 Bmagic mantis: he's reingesting on a test machine. Which can be done willy-nilly without concern about uptime and production nightly backup interference.
13:10 Bmagic It's a good idea to get familiar with the procedure on a test machine though!
13:10 Bmagic You can use the experience that you gain from running the procedure on a test machine to see how long it takes, which should translate roughly to the time it will take on production (if your test machine is close in size and shape)
13:10 Dyrcona If the Evergreen upgrade affects search indexes, then doing some sort of ingest is pretty much mandatory, or search will not reflect the changes in the upgrade. That's why I do the ingest on the database that has been upgraded to a more recent version of Evergreen than the original data.
13:12 Dyrcona Not knowing the history of your database and upgrades, I can't say if an ingest will solve your KPAC search issues or not.
13:13 mantis I can always give it a try anyway!
13:13 mantis thanks to you both
13:13 Bmagic one way to know: copy your production database to a test machine, ensure that you're having the same issue on your test machine as you do on production. Then run your reingest on the test machine, and see if it makes a difference
13:14 redavis joined #evergreen
13:17 Dyrcona You should be able to do either type of ingest while users are using Evergreen. They won't notice for the most part.
13:17 Dyrcona We used to kick a pingest off after the upgrade and start services so users could use Evergreen while the ingest was chugging away.
15:14 jihpringle joined #evergreen
15:58 mantis left #evergreen
16:24 jvwoolf joined #evergreen
17:08 mmorgan left #evergreen
17:36 jeffdavis eeevil++
17:36 jeffdavis that branch for bug 2106679 fixes our Items Out issue, not sure about side effects yet though
17:36 pinesol Launchpad bug 2106679 in Evergreen "Items Out is very slow to load" [Undecided,New] https://launchpad.net/bugs/2106679
17:45 jihpringle joined #evergreen
20:38 eeevil for the logs, if Dyrcona and mantis look later, there's a branch that addresses the browse index serialization issue, and (barring bugs) will allow queued ingest to /actually/ do parallel ingest, and it includes a pingest.pl update to teach it the browse version of the parallelizing trick it uses for symspell.
20:41 jvwoolf joined #evergreen
22:05 pinesol joined #evergreen
