[08:30] *** mmorgan joined #evergreen
[08:47] *** dguarrac joined #evergreen
[09:32] *** Rogan_ joined #evergreen
[09:35] *** tsadok joined #evergreen
[09:37] *** Dyrcona joined #evergreen
[10:19] <Dyrcona> jeffdavis: I can't reproduce Lp 2106679 with Evergreen 3.14.4 and Pg 16.8. I'm waiting to see if John Amundson is going to comment on the bug before I do.
[10:19] <pinesol> Launchpad bug 2106679 in Evergreen "Items Out is very slow to load" [Undecided,New] https://launchpad.net/bugs/2106679
[10:39] <berick> huh neat just noticed the cursor (left/right) in vim/terminal moves with the direction of the source language. e.g. עִברִית
[10:40] <csharp_> jeffdavis: maybe a missing index somewhere?
[10:40] <csharp_> (we're on 3.14.3 and haven't seen the issue that I'm aware of)
[11:18] *** Christineb joined #evergreen
[11:22] *** jihpringle joined #evergreen
[11:23] *** jvwoolf joined #evergreen
[12:11] <eeevil> jeffdavis: LP will let you know soon, but I pushed an (entirely untested, didn't even attempt to compile it) patch as food for thought re "slow items out" vs cursors
[12:15] *** mantis joined #evergreen
[12:16] <mantis> we're looking into a reingest project; does anyone have a recommendation on where to start for reading purposes? I remember there was something in past release notes but can't remember the version number
[12:33] <Bmagic> mantis: I assume you've seen this: https://docs.evergreen-ils.org/docs/latest/development/support_scripts.html#pingest_pl
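
(For context, a minimal sketch of a pingest.pl invocation. Option names are taken from the support-scripts documentation linked above and can differ between Evergreen releases, and the connection values are placeholders, so verify everything against `pingest.pl --help` on your own system.)

```bash
# A hedged sketch, not a recipe: option names come from the support-scripts
# doc linked above and may vary by Evergreen release; --db/--user/--host
# values are placeholders for your own site.
#   --max-child  : number of parallel worker processes to fork
#   --batch-size : number of records handed to each worker per batch
perl pingest.pl --db evergreen --user evergreen --host localhost \
     --max-child 4 --batch-size 10000
```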
[12:34] <mantis> ah yes! I did remember viewing this doc, but wasn't sure where it was. Thank you!
[12:34] <mantis> Bmagic++
[12:35] <Bmagic> let us know if you have any questions!
[12:37] *** jvwoolf joined #evergreen
[12:38] <mantis> Bmagic: definitely for sure
[12:39] <mantis> I would like to start these on test servers during off hours next week; do you have any best practice tips? We haven't been able to perform a successful parallel ingest before
[12:39] <mantis> I think we did try, but there was an issue with either storage or just having too many records, and a timeout happened
[12:39] <mantis> not 100% sure on that but it definitely had something to do with the size of our collection
[12:40] <Bmagic> what's the total number of bibs?
[12:44] <mantis> we're somewhere in the 600k area - at least that's what I'm seeing when running counts on IDs
[12:45] <Bmagic> the main issue we've run into is running the reingest at the same time as a backup. That usually results in a full disk situation. So if your backups are running at night, you want your reingest to finish (or a portion thereof) before the backup starts
[12:49] <mantis> Ok I'll keep everyone abreast of the project! Thank you!
[12:49] <mantis> Bmagic++
[12:49] <Dyrcona> mantis: I run it on 1.8 million bibs all the time.
[12:49] <mantis> Dyrcona: how often?
[12:50] <Dyrcona> At least monthly on my test systems.
[12:50] <Dyrcona> It is generally only necessary after database upgrades.
[12:51] <Dyrcona> Well, upgrades that affect search.
[12:52] <mantis> what's the difference between parallel and queued? parallel does it in concurrent batches?
[12:53] <Dyrcona> queued is newer and supposed to supplant parallel. parallel forks multiple processes. The browse indexes can't be done in parallel (or couldn't when the script was implemented), so they're done as a single batch.
[12:54] <mantis> is there a preference on which one to do first?
[12:54] <Dyrcona> queued ingest is also geared more toward day-to-day processing, IIRC. parallel is for when you want to reingest the whole thing or significant parts thereof.
[12:55] <Dyrcona> I've never manually kicked off a queued ingest if that helps, but I'm biased. :)
[12:55] <Bmagic> that said: I've used the queued ingest tool to reingest the whole database
[12:56] *** jihpringle joined #evergreen
[12:57] <Dyrcona> For ingesting the whole database, it's more of an either/or. I'd probably recommend a queued ingest. If you have symspell enabled, it plays more nicely with it, I think.
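
(An aside for readers: queued ingest is driven by global flags in the database. The sketch below only inspects them; the `ingest.queued.*` prefix is an assumption based on the Evergreen 3.11 release notes that introduced queued ingest, so confirm what your schema actually defines before enabling anything.)

```bash
# List whatever queued-ingest global flags this database defines. The
# ingest.queued.* prefix is an assumption from the 3.11 release notes;
# config.global_flag itself is a long-standing Evergreen table.
psql -d evergreen -c \
  "SELECT name, enabled, value FROM config.global_flag
    WHERE name LIKE 'ingest.queued.%' ORDER BY name;"

# Flags are toggled with a plain UPDATE, e.g. (on a test server first!):
#   UPDATE config.global_flag SET enabled = TRUE WHERE name = 'ingest.queued.all';
```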
[12:58] <Bmagic> the key is understanding *how long* the server needs to reingest X number of records, and timing it so that it will be done before the nightly backup begins. Once you know roughly how long it takes to reingest 100k bibs, for example, you can estimate the whole run. And if the estimate shows the reingest running into the nightly backup, you'll want to limit the number of bibs it does per execution
[12:59] <Bmagic> let's say it takes 10 hours to reingest 200k bibs. And you have 600k. You'll likely want to break that up into 200k chunks, per day for 3 days
[13:00] <Dyrcona> If it takes that long, disable symspell. :P
[13:00] <Bmagic> so you can kick it off in the morning (after the backup has completed), and allow it to finish the reingest of the 200k bibs. It should finish before the nightly backup. Rinse and repeat each morning.
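
(A sketch of that chunking, using the --start-id/--end-id options that appear in the pingest.pl docs. The ID boundaries are purely illustrative: bib IDs are rarely contiguous, so derive real chunk edges from your own data.)

```bash
# Find the real ID range first; the chunk boundaries below are illustrative
# placeholders for a ~600k-bib database, not real values.
psql -d evergreen -c \
  "SELECT MIN(id), MAX(id) FROM biblio.record_entry WHERE NOT deleted;"

# Day 1: kick off in the morning, after the nightly backup finishes.
perl pingest.pl --db evergreen --user evergreen --host localhost \
     --start-id 1 --end-id 200000
# Day 2: --start-id 200001 --end-id 400000
# Day 3: --start-id 400001 --end-id 600000
```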
[13:01] <Dyrcona> It comes down to why you want to do the ingest. Short of an upgrade, you normally wouldn't need to. If you add a new search index, then, yeah, reingest the affected records.
[13:02] <mantis> I think the original reason we wanted to do this was an issue with search index terms within our KPAC that got reported
[13:02] <mantis> would searching also become faster in terms of performance?
[13:03] <Dyrcona> It will have little if any impact on search performance.
[13:04] <Dyrcona> The purpose of ingest is to make sure that the search indexes are up to date with your records. This should be taken care of normally, but under some circumstances, they get out of whack.
[13:04] <mantis> why do you run ingests monthly?
[13:05] <Dyrcona> I copy a dump of production data to a local server. I load that dump into a database. I make a copy of that database which is updated to a more recent version of Evergreen. That gets dumped again, and reloaded as yet a third database.
[13:06] <Dyrcona> This third database is connected to a test vm running that release of Evergreen. I run a pingest on that database to make sure that search works.
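
(Dyrcona's dump/upgrade/reload cycle, sketched with stock PostgreSQL tools. Database names are placeholders, and the Evergreen schema-upgrade step itself is elided since it is version-specific.)

```bash
# 1. Dump production (custom format) and load it on the test host.
pg_dump -Fc -d evergreen -f evergreen-prod.dump   # run on/against production
createdb eg_baseline
pg_restore -d eg_baseline evergreen-prod.dump

# 2. Copy the baseline, then apply the version-upgrade SQL to the copy
#    (scripts live under Open-ILS/src/sql/Pg/version-upgrade/ in the source).
createdb -T eg_baseline eg_upgraded
# ... run the relevant upgrade scripts against eg_upgraded ...

# 3. Dump the upgraded copy and reload it as the third database the test VM
#    actually talks to, then run pingest.pl against it to check search.
pg_dump -Fc -d eg_upgraded -f eg-upgraded.dump
createdb eg_test
pg_restore -d eg_test eg-upgraded.dump
```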
[13:07] <Dyrcona> I find it is often more convenient to just do a pingest of the whole thing rather than do the SQL bits included in the DB upgrades' comments.
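
(The "SQL bits" are typically commented-out reingest calls at the bottom of an upgrade script. A hedged example follows; the argument list of metabib.reingest_metabib_field_entries has changed across Evergreen versions, so check the function definition in your schema before running anything like this.)

```bash
# Reingest field entries for every live bib. The boolean arguments
# (skip_facet, skip_browse, skip_search, ...) vary by Evergreen version;
# inspect the function with \df+ in psql first.
psql -d evergreen -c \
  "SELECT metabib.reingest_metabib_field_entries(id, FALSE, FALSE, FALSE)
     FROM biblio.record_entry
    WHERE NOT deleted;"
```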
[13:08] <Dyrcona> In production, we pretty much never do an ingest, unless we added a new search index (rare) or just did a version upgrade (probably less rare).
[13:08] <mantis> ah I see
[13:08] <mantis> so it's just a preference
[13:09] <Dyrcona> it might fix your KPAC issues depending on the problem. I don't know enough about the particulars to really say.
[13:09] <Bmagic> mantis: he's reingesting on a test machine. Which can be done willy-nilly without concern about uptime and production nightly backup interference.
[13:10] <Bmagic> It's a good idea to get familiar with the procedure on a test machine though!
[13:10] <Bmagic> You can use the experience that you gain from running the procedure on a test machine to see how long it takes, which should translate roughly to the time it will take on production (if your test machine is close in size and shape)
| 13:10 |
Dyrcona |
If the Evergreen upgrade affects search indexes, then doing some sort of ingest is pretty much mandatory, or search will not reflect the changes in the upgrade. That's why I do the ingest on the database that has been upgraded to a more recent version of Evergreen than the original data. |
| 13:12 |
Dyrcona |
Not knowing the history of your database and upgrades, I can't say if an ingest will solve your KPAC search issues or not. |
| 13:13 |
mantis |
I can always give it a try anyway! |
| 13:13 |
mantis |
thanks to you both |
| 13:13 |
Bmagic |
one way to know: copy your production database to a test machine, ensure that you're having the same issue on your test machine as you do on production. Then run your reingest on the test machine, and see if it makes a difference |
| 13:14 |
|
redavis joined #evergreen |
| 13:17 |
Dyrcona |
You should be able to do either type of ingest while users are using Evergreen. They won't notice for the most part. |
| 13:17 |
Dyrcona |
We used to kick a pingest off after the upgrade and start services so users could use Evergreen while the ingest was chugging away. |
[15:14] *** jihpringle joined #evergreen
[15:58] *** mantis left #evergreen
[16:24] *** jvwoolf joined #evergreen
[17:08] *** mmorgan left #evergreen
[17:36] <jeffdavis> eeevil++
[17:36] <jeffdavis> that branch for bug 2106679 fixes our Items Out issue, not sure about side effects yet though
[17:36] <pinesol> Launchpad bug 2106679 in Evergreen "Items Out is very slow to load" [Undecided,New] https://launchpad.net/bugs/2106679
[17:45] *** jihpringle joined #evergreen
[20:38] <eeevil> for the logs, if Dyrcona and mantis look later, there's a branch that addresses the browse index serialization issue, and (barring bugs) will allow queued ingest to /actually/ do parallel ingest, and it includes a pingest.pl update to teach it the browse version of the parallelizing trick it uses for symspell.
[20:41] *** jvwoolf joined #evergreen
[22:05] *** pinesol joined #evergreen