| Time |
Nick |
Message |
| 07:18 |
|
collum joined #evergreen |
| 08:45 |
|
Dyrcona joined #evergreen |
| 08:59 |
Dyrcona |
I've got two machines running practically the same code on the same Ubuntu release (20.04). One of them gives this error with reports "XMLENT XML Parse Error: mismatched tag." The other works just fine. |
| 09:07 |
Dyrcona |
No differences in eg_vhost.conf as far as reports go. |
| 09:17 |
csharp_ |
ansible++ |
| 09:19 |
csharp_ |
Dyrcona: by "reports" you mean you're seeing that in report outputs? may be too in my own head to grok that |
| 09:21 |
Dyrcona |
Yeah, when I open a report output, I get a 500 error and that's in the log. |
| 09:21 |
Dyrcona |
I mentioned it late yesterday. It's parsing HTML as XML, and <meta charset="utf-8"> is breaking it. |
| 09:21 |
Dyrcona |
But not in production, just on this test VM. |
| 09:22 |
Dyrcona |
I also diffed the branches, and the reports code is the same. |
| 09:26 |
Dyrcona |
Only differences in eg.conf concern a ping file and SSL cipher settings. |
| 09:27 |
csharp_ |
huh |
| 09:28 |
Dyrcona |
I suspect that there must be a patch applied to production that I don't have on the test VM or vice versa. |
| 09:30 |
Dyrcona |
Only code difference is array_agg vs. array_accum in Booking.pm because one has a patch applied that the other doesn't. (I'll need that patch, but not for this problem.) |
| 09:36 |
Dyrcona |
I thought this had come up before, but I can't find it in my IRC logs. |
| 09:38 |
csharp_ |
it sounds familiar |
| 09:43 |
Dyrcona |
I'm taking care of the array_accum patch before I forget. |
| 09:44 |
|
sandbergja joined #evergreen |
| 09:50 |
Dyrcona |
I tried searching Lp with fixed committed bugs enabled, and nothing came up there either. |
| 09:51 |
Dyrcona |
Production could have a patch applied from elsewhere... |
| 09:51 |
Dyrcona |
I suppose that I can test that locally. I do have the list of commits applied. |
| 09:54 |
sandbergja |
Has anybody run into a situation where logging in to the staff client starts to fail most (but not all) of the time? The browser console complains about open-ils.circ.offline.data.retrieve failing because it can't connect to the server to get stat cats. This is with redis, recent main with some bonus commits that don't seem related, and ubuntu |
| 09:54 |
sandbergja |
jammy. osrf_control --diagnostic reported that everything was happy. osrf_control --restart-all fixed the issue (at least for now...) |
| 09:54 |
sandbergja |
The error: https://gist.githubusercontent.com/sandbergja/c938c9a63286bededec1bb5a9e18a162/raw/bfed9be845978f0a21b518c53fe896d056585394/log%2520entries%2520-%2520login%2520issue |
| 09:54 |
sandbergja |
(it also is present in the server logs, which didn't have any additional details) |
| 09:55 |
berick |
sandbergja: drone/worker lost connection to postgres |
| 09:56 |
sandbergja |
berick++ |
| 09:56 |
berick |
i've seen this happen w/ automatic os updates, but could be other causes |
| 09:56 |
sandbergja |
gotcha, I could totally see that |
| 09:57 |
sandbergja |
Is there a good way to monitor for drones that have lost their db connection? |
| 09:57 |
berick |
sudo apt remove unattended-upgrades # to disable auto updates on critical servers |
| 09:58 |
berick |
(for ubuntu, anyway) |
| 09:58 |
berick |
sandbergja: hm, just the logs, i think |
| 09:58 |
sandbergja |
berick++ |
| 09:59 |
Dyrcona |
csharp_: I applied the production patches and no unexpected differences. I'm stumped as to why one server is trying to parse the reports html as xml and the other isn't. |
| 09:59 |
Dyrcona |
Of course, the server code could actually be different than what's in my local branch.... |
| 10:02 |
csharp_ |
maybe something where the perl libs are out of sync? |
| 10:03 |
csharp_ |
on upgraded servers the /usr/local/share/perl/{version} directories are preserved and I've seen upgraded servers try to use the older versions |
| 10:03 |
csharp_ |
OS upgrades, I mean |
| 10:04 |
csharp_ |
also seen OSes hang onto older APT perl packages |
| 10:09 |
Dyrcona |
csharp_: Thanks. I'll check that, but I think both of these are mostly fresh, not release-upgraded. |
| 10:09 |
Dyrcona |
Yeah, both only have /usr/local/share/perl/5.30.0 |
| 10:14 |
csharp_ |
Dyrcona: same version of clark-kent.pl? |
| 10:15 |
Dyrcona |
csharp_: I'll copy the files and diff them but I'm pretty sure they are the same. |
| 10:15 |
Dyrcona |
Both servers produce output with the <meta charset='utf-8'> tag, but only one complains about it. I think it has to be something in EGWeb. |
| 10:17 |
csharp_ |
and output from the "good" server fails on the bad and vice versa? |
| 10:17 |
Dyrcona |
csharp_: I haven't tried that, but will. |
| 10:17 |
Dyrcona |
I suspect output from the good server will fail on the bad one, and not vice versa. |
| 10:18 |
* csharp_ |
likes ignoring his own problems to help others with theirs :-) |
| 10:18 |
csharp_ |
yeah, I think the same thing |
| 10:22 |
Dyrcona |
Yeahp. reports from the good server error on the bad one. |
| 10:22 |
csharp_ |
Dyrcona++ |
| 10:22 |
Dyrcona |
I hesitate to go the other direction because it's a production machine. |
| 10:23 |
csharp_ |
SORRY I BROAK UR REPORTZ |
| 10:23 |
Dyrcona |
Looks like it could replace existing output. |
| 10:27 |
Dyrcona |
There has to be some kind of difference. |
| 10:30 |
Dyrcona |
I'm going to try a git clean -xfd and rebuild/install Evergreen. |
| 10:34 |
Dyrcona |
I doubt that different Postgres versions would affect this. I don't think just looking at the report output connects to Pg in any way. |
| 10:38 |
Dyrcona |
Well, that didn't help.... |
| 10:38 |
Dyrcona |
*confused unga bunga* |
| 10:44 |
Dyrcona |
hm. I think I just might need another db patch. |
| 10:50 |
Dyrcona |
git supposedly makes this easier, but the patch workflow doesn't really help. I can't just do git branch --contains <commithash> and find all of the branches with the same commit. |
| 10:50 |
Dyrcona |
I have to resort to grepping the logs for Lp numbers, etc. |
| 10:50 |
Dyrcona |
I'd like to switch to merge. |
| 10:57 |
Dyrcona |
Looks like I need a bunch more patches, but finding them all is proving to be a PITA. |
| 11:02 |
Dyrcona |
I have to log --grep for keywords like postgresql, etc. |
| 11:20 |
|
Christineb joined #evergreen |
| 11:21 |
jeffdavis |
We've run into the lost db connection thing before too. I've opened bug 2098507 as a feature request. |
| 11:21 |
pinesol |
Launchpad bug 2098507 in Evergreen "Respond gracefully when database connection is lost" [Undecided,New] https://launchpad.net/bugs/2098507 |
| 12:52 |
Dyrcona |
Well, applying all of the Pg commits that I could find did not help my issue with the XML parser error on the test vm. |
| 13:08 |
Dyrcona |
I am tempted to delete the VM and rebuild it. |
| 13:29 |
Dyrcona |
And that's what I'm doing. |
| 13:57 |
* Dyrcona |
is about to find out if OpenSRF 3.3.2 works with Evergreen 3.7.4. |
| 14:06 |
jeffdavis |
Looking at bug 2073561 ... |
| 14:06 |
pinesol |
Launchpad bug 2073561 in Evergreen "Incorrect content in the config.coded_value_map after applying the upgrade script from 3.12.3 to 3.13.0" [High,Confirmed] https://launchpad.net/bugs/2073561 |
| 14:06 |
jeffdavis |
For sites that haven't upgraded to 3.13.0+ yet, it seems like the best approach is simply to skip upgrade 1416 altogether? |
| 14:07 |
Dyrcona |
jeffdavis: I think so, yes. |
| 14:07 |
Dyrcona |
Can we recall it somehow? I suppose with another Lp bug... |
| 14:10 |
jeffdavis |
In that case my inclination would be to delete 1416 from the 3.13.0 version upgrade script in the next round of 3.13+ releases. |
| 14:10 |
Dyrcona |
Open a Lp bug and see what others think. |
| 14:11 |
csharp_ |
I didn't have the bandwidth to read through all the functions - is that something we can skip? |
| 14:11 |
jeffdavis |
I haven't checked thoroughly and won't be able to for a while (vacation next week)... |
| 14:11 |
csharp_ |
ah |
| 14:12 |
csharp_ |
have a nice time! |
| 14:12 |
jeffdavis |
... but if I understand Llewellyn's input correctly, fresh installs are unaffected by this issue because they don't install 961.data.marc21-tag-tables.sql at all. |
| 14:12 |
jeffdavis |
And 1416 is identical to 961.data.marc21-tag-tables.sql. |
| 14:12 |
csharp_ |
we are upgrading from 3.12 to 3.14 on Saturday |
| 14:12 |
csharp_ |
oh |
| 14:12 |
csharp_ |
hmm - maybe we can save ourselves the pain |
| 14:13 |
jeffdavis |
So simply omitting 1416 would in theory mean that an upgraded system is a match for a clean install. |
| 14:13 |
jeffdavis |
I haven't tested this yet *at all* so please don't rely on me being right about this :) |
| 14:13 |
jeffdavis |
(and good luck with the upgrade either way!) |
| 14:13 |
csharp_ |
thanks! |
| 14:14 |
Dyrcona |
I think you can skip it without danger. |
| 14:17 |
csharp_ |
Dyrcona++ # gonna skip it! |
| 14:25 |
Dyrcona |
Ugh. Half of a script just blew up because I did one of the steps early. |
| 14:27 |
Dyrcona |
Well, not quite half. |
| 14:28 |
Dyrcona |
I think I should be able to test OpenSRF now. |
| 14:30 |
Bmagic |
does the SIP 98/99 keepalive refresh the memcached authtoken? |
| 14:30 |
Dyrcona |
Bmagic: Not sure. |
| 14:50 |
jeffdavis |
Bmagic: Are you using SIPServer or SIP2Mediator? |
| 14:51 |
Bmagic |
SIPServer |
| 14:53 |
jeffdavis |
It looks to me like SIPServer doesn't check any authtoken when responding to 98 (looking at sub send_acs_status in Sip/MsgType.pm) |
| 14:54 |
jeffdavis |
or rather, it doesn't talk to EG about anything so there's no point where an authtoken is verified for 98/99 |
| 15:05 |
|
bshum joined #evergreen |
| 15:05 |
Dyrcona |
Welcome back, bshum! |
| 15:05 |
bshum |
Huzzah! |
| 15:06 |
bshum |
Dyrcona++ # moral support |
| 15:07 |
Dyrcona |
backups++ |
| 15:22 |
csharp_ |
bshum++ # supportive morals |
| 15:29 |
Bmagic |
jeffdavis++ |
| 15:29 |
Bmagic |
bshum++ # wb |
| 15:30 |
Dyrcona |
Was gonna say that my installation worked on the first try, but I have to remove the legacy JSON gateway from the Apache config. |
| 15:31 |
Dyrcona |
Guess that's a side effect of using OpenSRF 3.2.3. |
| 15:33 |
Dyrcona |
Huh. I thought this branch had all of our customizations in it. |
| 15:34 |
* Dyrcona |
is more confuzzled than before. |