Time |
Nick |
Message |
00:02 |
|
cmalm joined #evergreen |
00:22 |
|
cmalm joined #evergreen |
00:43 |
|
cmalm joined #evergreen |
01:03 |
|
cmalm joined #evergreen |
01:22 |
|
cmalm joined #evergreen |
01:43 |
|
cmalm joined #evergreen |
02:09 |
|
cmalm joined #evergreen |
02:31 |
|
cmalm joined #evergreen |
02:52 |
|
cmalm joined #evergreen |
03:09 |
|
cmalm joined #evergreen |
06:01 |
pinesol |
News from qatests: Testing Success <http://testing.evergreen-ils.org/~live> |
07:21 |
|
rfrasur joined #evergreen |
08:33 |
|
alynn26 joined #evergreen |
08:37 |
|
mantis1 joined #evergreen |
08:44 |
|
Dyrcona joined #evergreen |
08:48 |
|
cmalm joined #evergreen |
08:48 |
|
rjackson_isl joined #evergreen |
08:52 |
Dyrcona |
I've noticed something about the autorenewal notice emails that get stuck in collected state. The user_data indicates that they have no renewals remaining, so there might be a bug in handling that condition. |
09:20 |
|
mantis2 joined #evergreen |
09:26 |
|
mantis1 joined #evergreen |
09:30 |
|
jvwoolf joined #evergreen |
09:58 |
JBoyer |
Dyrcona, re: autorenewals, did the ones that failed have an unequal number of auto- and regular renewals? (like 0 renewals, but 1 autorenewal, etc.) |
10:03 |
Dyrcona |
The first one has both renewal_remaining and auto_renewal_remaining of 0. I'm actually doing something else related to this at the moment, and I've not started digging into the data, yet. I'm copying the relevant syslogs to where they won't get rotated. |
10:03 |
Dyrcona |
I can't trust the entries on our syslog server because UDP. |
10:03 |
Dyrcona |
If I use TCP, it slows production down. :( |
10:17 |
Dyrcona |
Well, syslog is useless. Shows the event being created, getting updated a couple of times state = found -> collecting -> collected, then nothing. |
10:20 |
Dyrcona |
open-ils.trigger_stderr.log is full of this: Use of uninitialized value in subtraction (-) at /usr/local/share/perl/5.22.1/OpenSRF/AppSession.pm line 952. |
10:21 |
Dyrcona |
Not sure if it's related, since it happens today, too, and nothing looks broken today. There are no dates in the stderr logs, but the file's datestamp is Dec 27 05:24 |
10:26 |
Dyrcona |
Well, hello: Caught error from 'run' method: Exception: OpenSRF::EX::JabberDisconnected 2019-12-26T06:26:48 OpenSRF::Application /usr/local/share/perl/5.22.1/OpenSRF/Application.pm:240 JabberDisconnected Exception: This JabberClient instance is no longer connected to the server |
10:27 |
JBoyer |
That would cause some trouble. |
10:27 |
Dyrcona |
But, that looks like a symptom, not a cause. It seems to happen every time I reset this group of events back to pending. |
10:28 |
JBoyer |
As for the logging, I don't know for certain if it would help or if you've looked into it, but you can set rsyslog up to use a disk cache on the client in case that may help with the tcp bottleneck. |
10:28 |
Dyrcona |
And, again at 9:30 pm when some other events got stuck in collected state. |
10:28 |
JBoyer |
meaning every time you run that batch of events you get the no longer connected message? |
10:28 |
Dyrcona |
Well, messages sent to the server are also logged in the clients' syslog. |
10:29 |
Dyrcona |
JBoyer: Yes. The times correspond to the log entries in syslog for this event and the times that I ran a db update to set the events back to pending. |
10:30 |
Dyrcona |
There are more of those messages, obviously. |
10:31 |
Dyrcona |
I did a grep for -v 952 to find those messages. At the very end I get Caught error from 'run' method: Unable to update event group state at /usr/local/share/perl/5.22.1/OpenILS/Application/Trigger/EventGroup.pm line 103. |
10:31 |
Dyrcona |
with no timestamp. |
10:31 |
Dyrcona |
Nothing to identify the event, either. It could be from this morning or from last night. |
10:32 |
Dyrcona |
Doesn't appear in the syslog, so not actionable. |
10:33 |
JBoyer |
I would expect you to see this elsewhere already but just in case, I have seen circs with wildly bad dates (like 17 CE) cause serious problems for the backend. Not sure you'd get that particular jabber error though. |
10:34 |
Dyrcona |
Well, interesting thing about the one from 9:30 pm. It appeared to take 3 hours or so to die, and was holding the lock file the whole time, because I got emails every half hour from 10:00 pm to 12:30 am saying the lock file was there. |
10:35 |
pinesol |
[evergreen|Zavier Banks] lp1827942: prevent clicking outside an Angular modal from closing it - <http://git.evergreen-ils.org/?p=Evergreen.git;a=commit;h=92a6def> |
10:35 |
Dyrcona |
JBoyer: I can check the dates of the 92 circulations, but the first one looked OK. |
10:38 |
JBoyer |
I've seen that cause problems specifically for the fine generator, but I would expect it to have a lot quicker effect than 3-ish hours. :/ |
10:39 |
Dyrcona |
I should probably check the 34 auto-renewals that errored. I wouldn't be surprised if the 4 that generated the auto-renewal notice event are causing the problem. |
10:43 |
Dyrcona |
Interesting.... These all have a checkin_scan_time equal to the xact_start. |
10:47 |
Dyrcona |
Looks like the errored auto-renewal events have the same on the circulations also. I'm going to dump these to csvs where it's more obvious. |
10:50 |
Dyrcona |
Eh, no. I was mistaken. Opening the spreadsheet shows that I read the wrapped output in the console incorrectly, and what I thought was the last column was the penultimate column. |
10:51 |
Dyrcona |
What I thought was checkin_scan_time is actually the create_time. |
10:53 |
Dyrcona |
Some of the errored ones look like they were checked in already. Perhaps auto-renew sometimes tries to process circulations twice? |
10:53 |
Dyrcona |
Only way to know.... dump the event.id and event.add_time as well as circulation for all of the auto-renew events. |
10:54 |
|
sandbergja joined #evergreen |
10:57 |
Dyrcona |
I didn't even have to sort by circulation id to find that two renewal events were created for the same circulation: 91877668. |
10:59 |
Dyrcona |
And, no. I'm apparently wrong again and need new glasses. :( |
10:59 |
|
stephengwills joined #evergreen |
11:01 |
rjackson_isl |
in reviewing EI logs from our utility server (processing autorenewals) and the actual triggered events entries and related circs in the db I see we have events that were not processed as late as 11 AM when the processing actually kicks off at 3 AM :( |
11:01 |
rjackson_isl |
some of those circs get renewed by staff of course that late in the day |
11:01 |
Dyrcona |
rjackson_isl: Yeah. Ours starts at 3:05 am and was still chugging away 3 hours later. |
11:02 |
Dyrcona |
I also reset ones that were stuck in collected state, but some of those never finished. |
11:02 |
rjackson_isl |
ours are managing to chug thru the whole lot but really late into the open hours for some libraries |
11:03 |
Dyrcona |
As for this slip up in reading the spreadsheet, I mistook ...698 for ...668. I realized my mistake after sorting and increasing the font size. |
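For the logs: eyeballing a spreadsheet for repeated circulation ids is easy to get wrong, so here is a minimal Python sketch that checks the dumped CSV instead. The column names `event_id` and `circ_id` and the idea of a single CSV file per dump are assumptions for illustration, not the actual layout of the dump discussed above.

```python
import csv
from collections import defaultdict


def duplicate_targets(csv_path, id_col="event_id", target_col="circ_id"):
    """Return {circulation id: [event ids]} for circulations with more than one event.

    The column names are hypothetical; adjust them to match the real dump.
    """
    events_by_target = defaultdict(list)
    with open(csv_path, newline="") as fh:
        for row in csv.DictReader(fh):
            events_by_target[row[target_col]].append(row[id_col])
    # Keep only the circulations that were targeted by two or more events.
    return {t: ids for t, ids in events_by_target.items() if len(ids) > 1}
```

An empty result means no circulation appears in more than one event row, which is the answer the manual sort was trying to reach.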
11:03 |
Dyrcona |
rjackson_isl: Do you use a parallel setting for action trigger? We use 3. |
11:04 |
rjackson_isl |
is that in the cron execution entry? |
11:04 |
rjackson_isl |
if so this is all we do: /openils/bin/action_trigger_runner.pl --process-hooks --run-pending --granularity Daily |
11:07 |
Dyrcona |
rjackson_isl: It's in opensrf.xml configuration for the open-ils.trigger drone, IIRC. |
11:07 |
Dyrcona |
I know it's in opensrf.xml. |
11:07 |
Dyrcona |
And "drone" should have been written "service." |
11:09 |
pinesol |
[evergreen|Galen Charlton] LP#1827942: follow-up to fix a couple issues - <http://git.evergreen-ils.org/?p=Evergreen.git;a=commit;h=4a67e5d> |
11:09 |
rjackson_isl |
is it in the <app_settings> ? if so we have 3 for collect and react |
11:10 |
Dyrcona |
JBoyer: I don't see any crazy dates in the auto-renewal events and circulations that I've dumped, and that was all 37,247 of them. :) |
11:10 |
Dyrcona |
rjackson_isl: Yeah, that's it. How many auto-renewals do you do in a day, event_def 124? |
11:11 |
rjackson_isl |
we have unique event defs per system? |
11:11 |
Dyrcona |
rjackson_isl: Also, that section is not commented out? It's there by default and commented out. |
11:11 |
stephengwills |
is there any harm in killing off a bunch of 7 day old SELECT auditor.clear_audit_info(); calls? |
11:12 |
rjackson_isl |
nope - it is not commented out |
11:13 |
Dyrcona |
rjackson_isl: bshum mentioned splitting them up by library to me yesterday in private chat regarding courtesy notices. I'm considering it. I'm not sure what it buys us unless I also split them up across multiple servers. |
11:13 |
Dyrcona |
And, then the database gets hammered the same or more if I run them close together. |
11:15 |
Dyrcona |
stephengwills: Probably not, but you might want to see if you can figure out why they're taking so long. |
11:15 |
Dyrcona |
And, spreading autorenewals out throughout the day to reduce load runs into other issues, as rjackson_isl has mentioned. |
11:31 |
Dyrcona |
I'm going to have to figure out how that parallel setting works because it doesn't fork and it doesn't use threads, at least at the O/S level. |
11:32 |
Dyrcona |
I know the Perl module it uses, but I need to look into the guts to see what that does. |
11:42 |
gmcharlt |
Dyrcona: the very brief version, with caveat that it's been a couple years at least since I looked at the code: multisession doesn't fork itself as an OpenSRF client; it instead fires off parallel requests and waits for responses |
11:43 |
gmcharlt |
the forking as such happens with the drones that handle the requests |
12:11 |
Dyrcona |
gmcharlt: Thanks. That's kind of what I "remembered." I'm going to have another look just the same. |
12:18 |
Dyrcona |
And, for the logs: Yes, what gmcharlt said is pretty much what OpenSRF::MultiSession does. It makes requests up to its capacity (given as a parameter) and then waits until some finish before starting new requests. |
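For the logs: the pattern described above (keep up to a fixed number of requests in flight, wait for some to finish, then send more) can be sketched in Python as follows. This is only an illustration of the dispatch pattern, not a port of OpenSRF::MultiSession. The names `run_bounded`, `handler`, and `capacity` are made up for the example, and the thread pool merely stands in for the remote drones that do the real work; as noted above, the actual client neither forks nor uses threads.

```python
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED


def run_bounded(requests, handler, capacity=3):
    """Run handler(request) for every request, with at most `capacity` in flight.

    Returns a dict mapping each request to its result. Requests must be hashable.
    """
    results = {}
    in_flight = {}  # future -> the request it is handling
    queue = iter(requests)
    with ThreadPoolExecutor(max_workers=capacity) as pool:
        while True:
            # Top up: start new requests until we reach capacity or run out.
            while len(in_flight) < capacity:
                try:
                    request = next(queue)
                except StopIteration:
                    break
                in_flight[pool.submit(handler, request)] = request
            if not in_flight:
                break  # nothing running and nothing left to send
            # Block until at least one outstanding request has finished.
            done, _ = wait(in_flight, return_when=FIRST_COMPLETED)
            for future in done:
                results[in_flight.pop(future)] = future.result()
    return results
```

With `capacity=3`, which matches the parallel value mentioned earlier in the discussion, a fourth request is not started until one of the first three has returned.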
13:07 |
rfrasur |
csharp, are you around? |
13:08 |
rfrasur |
alternatively, are there any acquisitions savants lurking about? |
14:04 |
rjackson_isl |
rfrasur needs a mirror |
14:04 |
rfrasur |
psh |
14:12 |
Dyrcona |
:) |
14:14 |
gmcharlt |
berick: JBoyer: et al. pull request on bug 1857710 now available; fixes a regression that breaks the Angular client |
14:14 |
pinesol |
Launchpad bug 1857710 in Evergreen 3.3 "Angular client whitescreen on Firefox" [Critical,New] https://launchpad.net/bugs/1857710 |
14:15 |
gmcharlt |
(for Firefox) |
14:54 |
|
sandbergja joined #evergreen |
15:02 |
* JBoyer |
belatedly realizes that the reason his test server is misbehaving is that it's in the process of being rebuilt... This makes testing difficult. |
15:31 |
|
mantis1 left #evergreen |
15:44 |
pinesol |
[evergreen|Galen Charlton] LP#1857710: fix Angular client whitescreen on Firefox - <http://git.evergreen-ils.org/?p=Evergreen.git;a=commit;h=1df6edd> |
16:04 |
pinesol |
[evergreen|Katlyn Beck] lp1712644 Prevent check out due date in past - <http://git.evergreen-ils.org/?p=Evergreen.git;a=commit;h=4dbb87a> |
16:10 |
pinesol |
[evergreen|Dan Briem] LP#1780283 Checking One Bill Checks Them All - <http://git.evergreen-ils.org/?p=Evergreen.git;a=commit;h=ad26b08> |
16:10 |
|
sandbergja joined #evergreen |
16:17 |
* JBoyer |
makes angry noises, stomps off to LP |
16:18 |
pinesol |
[evergreen|Bill Erickson] LP1848778 Use consistent MARC breaker delimiter - <http://git.evergreen-ils.org/?p=Evergreen.git;a=commit;h=9776b89> |
16:24 |
pinesol |
[evergreen|Mike Risher] lp1843640 Standing Penalty Followup - <http://git.evergreen-ils.org/?p=Evergreen.git;a=commit;h=14121ee> |
18:01 |
pinesol |
News from qatests: Testing Success <http://testing.evergreen-ils.org/~live> |
18:08 |
|
abowling1 joined #evergreen |
18:25 |
|
abowling joined #evergreen |
19:20 |
|
sandbergja joined #evergreen |
20:16 |
dbs |
Thanks for the fixes to rel_3_4! |
21:18 |
|
cmalm joined #evergreen |
21:43 |
|
cmalm joined #evergreen |
22:11 |
|
cmalm joined #evergreen |
22:34 |
|
cmalm joined #evergreen |
22:54 |
|
cmalm joined #evergreen |
23:24 |
|
cmalm joined #evergreen |
23:49 |
|
cmalm joined #evergreen |