[01:56] *** awitter joined #evergreen
[06:00] <pinesol> News from qatests: Testing Success <http://testing.evergreen-ils.org/~live>
[07:18] *** rjackson_isl_hom joined #evergreen
[07:51] *** mantis1 joined #evergreen
[08:00] *** mantis1 joined #evergreen
[08:04] *** rfrasur joined #evergreen
[08:39] *** jvwoolf joined #evergreen
[08:40] *** mmorgan joined #evergreen
[08:50] *** collum joined #evergreen
[09:03] *** Dyrcona joined #evergreen
[09:57] <Dyrcona> Truncate action_trigger.event and a test of the daily a/t runner goes really fast.
[10:00] *** Christineb joined #evergreen
[10:08] <Bmagic> Dyrcona++ # burn it all down
[11:14] <Dyrcona> Well, it still takes a while to churn through 14,286 events. But collecting them into the table was really fast. :)
[11:48] <csharp> berick: new fix applied to PINES production - so far so good, drone-wise - I'll let you know if we hear complaints
[11:53] <berick> csharp: cool
[12:03] *** jihpringle joined #evergreen
[13:01] <jeffdavis> csharp++ # sometimes you gotta test in production
[13:06] <csharp> jeffdavis: yep, it's the only way to see if something that only emerges in production actually works!
[13:36] <jeffdavis> updated bug 1896285 - I think the 3.5 version is OK to go in rel_3_5
[13:36] <pinesol> Launchpad bug 1896285 in Evergreen 3.5 "Use batch methods for multi-row grid actions" [Medium,Confirmed] https://launchpad.net/bugs/1896285
[13:46] <csharp> jeffdavis: +1
[14:17] *** alynn26 joined #evergreen
[14:33] *** sandbergja joined #evergreen
[14:53] <Bmagic> jeffdavis: I rolled that patch into our production 3.5 system last night. It's been more stable today
[14:58] <Bmagic> csharp: did you put opensrf_bug 1912834 on production as well?
[14:59] *** mantis1 left #evergreen
[15:06] *** pinesol` joined #evergreen
[15:11] *** sandbergja joined #evergreen
[15:20] <csharp> wow - something appears very broken here - something sent apparently hundreds of null ids to a staff search and it went for 6.5 minutes with the accompanying NOT CONNECTED errors
[15:21] <csharp> this is the activity.log call: https://pastebin.com/VnNexDjf
[15:23] <csharp> Bmagic: yes, we've applied that too
[15:24] <Dyrcona> @blame tsbere
[15:24] <pinesol> Dyrcona: tsbere stole bshum's tux doll!
[15:24] <csharp> (well, that's the "new fix" I was telling berick about earlier)
[15:24] <csharp> @seen tsbere
[15:24] <pinesol> csharp: tsbere was last seen in #evergreen 3 years, 36 weeks, 4 days, 1 hour, 3 minutes, and 17 seconds ago: <tsbere> er, not anoning
[15:25] <berick> csharp: that api is from the staff catalog. should be a quick fix
[15:25] <Dyrcona> It's nothing major, just you have to restart open-ils.circ after changing the circ.opac_renewal.use_original_circ_lib global flag.
[15:26] <Dyrcona> According to git blame, tsbere added the code that caches the setting.
[15:26] <csharp> Dyrcona: UNBELIEVEABLE
[15:26] <csharp> berick: awesome
[15:28] <Dyrcona> The code has been there since 2011, so you'd think I would have known....
[15:28] <Dyrcona> @blame Dyrcona
[15:28] <pinesol> Dyrcona: Dyrcona is probably integrated with systemd
[15:28] <Bmagic> csharp: did you determine the reason for the cataloging (bucket?) issue?
[15:31] <csharp> Bmagic: berick did the work, I just applied it this morning
[15:31] <berick> the issue was the patch needed fixing
[15:31] <Bmagic> right, so, are you saying the bucket issue was related?
[15:31] <csharp> the bucket issue? sorry I may not be following
[15:32] <csharp> what the patch does is throttle concurrent OpenSRF requests so as not to overwhelm services like open-ils.actor
[15:33] <csharp> so a cataloger adding 50+ items to a bucket was creating 50+ actor drones and if you multiply that by $pines_catalogers, it gets hairy real fast
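For context on the throttle csharp describes above: the general pattern is to cap how many requests are in flight at once and queue the rest. The sketch below illustrates that pattern in TypeScript; it is not the actual OpenSRF/eg2 patch, and the limit of 5 and the addItemToBucket helper are assumptions for the example (the 5 echoes Bmagic's comment later in the log).

```typescript
// Minimal concurrency limiter, for illustration only (not the Evergreen patch):
// run at most `limit` tasks at once and queue the rest.
function limitConcurrency<T>(limit: number) {
    let active = 0;
    const queue: Array<() => void> = [];

    const finish = () => {
        active--;
        const next = queue.shift();
        if (next) { next(); }
    };

    return (task: () => Promise<T>): Promise<T> =>
        new Promise<T>((resolve, reject) => {
            const run = () => {
                active++;
                task().then(resolve, reject).finally(finish);
            };
            if (active < limit) { run(); } else { queue.push(run); }
        });
}

// Hypothetical stand-in for a bucket-add request; not a real Evergreen API wrapper.
const addItemToBucket = (copyId: number): Promise<number> =>
    new Promise(resolve => setTimeout(() => resolve(copyId), 50));

// A cataloger adding 50 items still fires 50 logical requests,
// but only 5 are in flight against the server at any one time.
const limited = limitConcurrency<number>(5);
const copyIds = Array.from({length: 50}, (_, i) => i + 1);
Promise.all(copyIds.map(id => limited(() => addItemToBucket(id))))
    .then(ids => console.log(`done adding ${ids.length} items`));
```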
[15:34] <berick> csharp: you said something about buckets yesterday when you were finding misc. bugs from the patch
[15:34] <berick> (though none of the bugs were really bucket related)
[15:37] <csharp> berick: ah - thanks
[15:37] <Bmagic> right on - well, I'm going to merge that opensrf branch into production this evening as well. Having bug 1896285 - we noticed an improvement, but today, we still saw a couple of machines get out of control and die
[15:37] <pinesol> Launchpad bug 1896285 in Evergreen 3.5 "Use batch methods for multi-row grid actions" [Medium,Confirmed] https://launchpad.net/bugs/1896285
[15:37] <csharp> Bmagic: yeah, I think the bucket thing was just shrinking resources affecting all kinds of things
[15:37] <csharp> the complaints were varied but all dealt with batch actions
[15:37] *** sandbergja joined #evergreen
[15:38] <csharp> and today has been relatively quiet after applying the latest version of the fix
[15:38] <Bmagic> throttling the number of requests to 5 at a time should help ALOT
[15:38] <csharp> yes, system-side has been back to what I think of as "normal"
[15:38] <Bmagic> that's encouraging, patching now...
[15:39] <berick> csharp: mind checking something? that null-blast issue, do you see vandelay api calls around the same time (just prior)?
[15:39] <Bmagic> Thinking back - I noticed a huge impact on the servers moving from XUL to the web based client
[15:39] <berick> just found that vandelay also uses that api, and may be more likely the culprit
[15:40] <csharp> berick: looking now
[15:40] <Bmagic> at that time, we upped the hardware on all of the bricks, fleet-wide, to cope with the onslaught of requests that the web client seemed to be doing. It seems this "shot-gun" blast from the Evergreen web client is not new
[15:40] <berick> Bmagic: IIRC, the XUL client natively limited the number of XHR requests to something like 8 at a time, so it always had a baked in limit
[15:41] <Bmagic> berick: I was wondering
[15:44] <csharp> berick: this looks like the culprit: open-ils.search open-ils.search.biblio.multiclass.query.staff {"limit":1001,"offset":0}, "(keyword:the prophets) site(PINES)
[15:45] <berick> csharp: huh, was trying stuff like that couldn't make it happen. looks like also protected in the code, but i'm clearly missing something..
[15:45] <csharp> I'll try that call from srfsh to see what happens
[15:45] <csharp> or maybe it won't act the same from srfsh?
[15:46] <csharp> that came back super fast, so I guess not
[15:49] <berick> yeah, i can't get the catalog to send nulls, strange
[15:50] <csharp> open-ils.acq open-ils.acq.purchase_order.retrieve "<REDACTED>",, 7030, {"flesh_price_sum
[15:50] <csharp> mary":true,"flesh_provider":true,"flesh_lineitem_count":true}
[15:50] <csharp> that's right before as well
[15:51] <csharp> well, 2 seconds before
[15:52] <berick> csharp: does this return 0? select count(*) from vandelay.bib_match where eg_record is null;
[15:53] <berick> 0 is exptected, but just to rule that out
[15:53] <Bmagic> zero for me
[15:53] <csharp> 0 here too
[15:53] <berick> k
[15:54] <csharp> I can confirm that the record IDs before all the nulls all lead to records with "the prophet" somewhere in the record
[15:54] <berick> csharp: ok, good, that settles that
[15:56] <berick> csharp: any local diffs to origin/master for Open-ILS/src/eg2/src/app/share/catalog/catalog.service.ts or search-context.ts ?
[15:56] <csharp> lemme look
[15:56] <csharp> 2021-01-26 14:59:41 brick01-head gateway: [ACT:60271:osrf-websocket-stdio.c:559:1611691160602718] [127.0.0.1] [] open-ils.search open-ils.search.biblio.record.catalog_summary.staff 191, [null,null,null,null,null,null,null,null,null,null]
[15:56] *** jonadab joined #evergreen
[15:56] <csharp> another example of fewer
[15:56] <csharp> it's only happened seven times today
[15:57] <berick> interesting
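The gateway line above shows open-ils.search.biblio.record.catalog_summary.staff called with what appears to be an org unit id (191) and a list of record ids that are all null. Purely as a hypothetical sketch of the kind of client-side guard that would block such a call (not the fix berick and csharp are converging on here), the id list could be filtered before the batch request goes out:

```typescript
// Hypothetical guard, illustration only. `fetchCatalogSummaries` stands in for
// whatever client code issues the catalog_summary.staff request; it is not a
// real Evergreen function name.
function fetchCatalogSummaries(orgId: number, recordIds: number[]): Promise<unknown[]> {
    // stub so the sketch is self-contained
    return Promise.resolve(recordIds.map(id => ({id})));
}

function safeFetchSummaries(
    orgId: number,
    ids: Array<number | null | undefined>
): Promise<unknown[]> {
    // drop null/undefined ids; skip the server call entirely if nothing valid remains
    const validIds = ids.filter((id): id is number => id !== null && id !== undefined);
    return validIds.length ? fetchCatalogSummaries(orgId, validIds) : Promise.resolve([]);
}

// The all-null call from the gateway log would short-circuit client-side:
safeFetchSummaries(191, [null, null, null, null]).then(res => console.log(res)); // []
```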
[15:58] <berick> or bib-record.service.ts
[15:59] <csharp> ah...
[15:59] <csharp> I think I found the issue
[16:00] <csharp> to allow searching in the (goddamned) search box on the staff page, I changed "let query = ts.query[idx];" to "let query = decodeURIComponent(ts.query[idx]);" around line 510 or so
[16:01] <csharp> I have a feeling that's causing an unexpected side effect
[16:01] <csharp> we may just have to remove that box until we can safely redirect to the eg2 version
[16:02] <csharp> that's in Open-ILS/src/eg2/src/app/share/catalog/search-context.ts btw
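For reference, the local change csharp describes is reconstructed below. Only the before/after line comes from the chat; the rest is an illustrative sketch of how a second decode can misbehave if the string is already decoded at that point (an assumption, not verified here).

```typescript
// Open-ILS/src/eg2/src/app/share/catalog/search-context.ts, local PINES change
// as described above ("around line 510 or so"):
//
//   before:  let query = ts.query[idx];
//   after:   let query = decodeURIComponent(ts.query[idx]);
//
// If ts.query[idx] already holds a decoded string (assumption for this sketch),
// decoding it again is harmless for plain text but can mangle or throw on
// terms that contain '%' sequences:
const alreadyDecoded = 'keyword: 100% wool';
try {
    console.log(decodeURIComponent(alreadyDecoded));
} catch (e) {
    // URIError: '%' followed by non-hex characters is a malformed escape
    console.log('double-decode failed:', e);
}
```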
[16:05] <csharp> apparently we can't win on the OpenSRF throttling either - getting complaints of things not loading in acq - I'll try to get more specifics
[16:06] <csharp> argh! it was such a smooth upgrade server-side - these piddly issues are going to be the death of me
[16:07] *** sandbergja joined #evergreen
[16:56] *** sandbergja joined #evergreen
[17:00] <Bmagic> csharp: JS cache client-side might account for the patch not "taking hold" on some workstations
[17:26] *** mmorgan left #evergreen
[17:49] *** Cocopuff2018 joined #evergreen
[18:00] <pinesol> News from qatests: Testing Success <http://testing.evergreen-ils.org/~live>
[18:02] *** book` joined #evergreen
[19:04] *** Dyrcona joined #evergreen
[19:09] *** sandbergja joined #evergreen
[20:21] *** jonadab joined #evergreen
[20:41] *** Dyrcona joined #evergreen
[21:10] *** sandbergja joined #evergreen
[21:20] *** sandbergja joined #evergreen
[23:14] *** jamesrf joined #evergreen
[23:38] *** Cocopuff2018 joined #evergreen