Evergreen ILS Website

IRC log for #evergreen, 2021-12-07

| Channels | #evergreen index | Today | | Search | Google Search | Plain-Text | summary | Join Webchat

All times shown according to the server's local time.

Time Nick Message
06:00 pinesol News from qatests: Testing Success <http://testing.evergreen-ils.org/~live>
07:30 rjackson_isl_hom joined #evergreen
07:47 collum joined #evergreen
08:36 mantis joined #evergreen
08:36 mmorgan joined #evergreen
08:53 rfrasur joined #evergreen
09:34 jvwoolf joined #evergreen
09:36 Dyrcona joined #evergreen
09:50 Dyrcona I'm getting "DATABASE UPDATE FAILED" for some attempted vandelay imports, but PostgreSQL is not logging any errors? Anyone have any suggestions?
09:55 berick Dyrcona: i'd check for errors in opensrf logs.  sometimes cstore/pcrud will error before they actually issue any queries
10:15 Dyrcona berick, i was about to do that but got sidetracked by somethings else.
10:20 Dyrcona Trouble with that is having action triggers running produces a lot of "noise."
10:31 Dyrcona marc_stream_importer stopped working on our training server after we upgraded to 3.7.2, and I'm trying to figure out why. It's more difficult when the hold targeter is running and it adds about 10MB of log data in a minute.
10:40 Dyrcona So, after stopping everything, dropping the crontab, restarting services, starting the stream importer, and trying again, still nothing. The queued bib record entry is there with a database update failed, but no error in the postgresql log and none in the syslog, or /openils/var/log/* files.
10:41 berick Dyrcona: nothing in vandelay.queued_bib_record.error_detail (or .import_error)
10:41 berick ?
10:43 rjackson_isl_hom joined #evergreen
10:44 Dyrcona I was looking at vandelay.queued_bib_record.error_detail  and seeing this "DATABASE_UPDATE_FAILED : The attempt to write to the DB failed." That's why I asked.
10:45 berick ah, we've come full circle
10:45 Dyrcona However, it looks like the restart of services may have done the trick. The last two records I sent don't have error detail. I'm going to add more fields to my select.
10:50 Dyrcona Ah, yes... Bilbio Record-Entry, Frodo's librarian cousin.
10:51 Dyrcona So, it didn't automatically create a bib record or appear to update any. I was kind of expecting that, but I don't really know what to expect. I'm sure they'll tell me it is still broken.
10:55 Dyrcona Nope. import_time is null. Hasn't been a successful import since 2021-11-30 11:15:15.851648-05. So, it worked once a few days ago, but not since, and I have not touched it until this morning.
10:57 Dyrcona This Evergreen 3.7.2 on Pg 10, though it worked on Evergreen 3.5.3 with Pg 10.
10:57 Dyrcona IDK, maybe I'll try a reinstall later. Sometimes, rm -rf /openils/* and reinstalling everything helps.
11:00 Dyrcona Still curious that I get a DATABASE_UPDATE_FAILED and no error in the postresql-10-main.log.
11:01 Dyrcona It logged my bilbio.record_entry typo with a relation does not exist.
11:10 Dyrcona I wonder if my patch for bib match quality ratio is interfering, but our cataloger specified a ratio of 1.0 and it looked dead simple to add it.
11:11 Dyrcona I think the patch was there on 3.5.3. It was in the branch, but doesn't mean the importer with that feature got installed.
11:25 Dyrcona Think I'll back that code out and try it that way.
11:25 Dyrcona Well, I'll edit it in situ on training.
11:32 Dyrcona Nah, didn't help. :(
11:33 Dyrcona Another mysterious database error.
11:34 Dyrcona I ran autogen during all of this at one point, too. I'm pretty sure the IDL is up to date.
11:40 Dyrcona Yeah, the installed IDLs are the same as in the branch....
12:16 jihpringle joined #evergreen
12:56 Dyrcona berick: I'm starting to think that this could be a case of something unexpected happened, and we're falling back to reporting a database error happened on the presumption that is the problem. IIRC, the XUL client used to do that in some of its switch/case type error checking things. The default/else was to throw a network error happened.
13:10 Dyrcona Oh ho! I missed this earlier: Dec  7 10:35:20 training open-ils.vandelay: [ERR :21108:CStoreEditor.pm:155:1638891159210936] editor[1|1873642] error rolling back database transaction
13:12 Dyrcona 2021-12-06 10:35:23.668 EST [19925] evergreen@evergreen WARNING:  there is no transaction in progress
13:14 Dyrcona The fact that they're 3 seconds apart kind of bugs me. I ignored those before in the PostgreSQL logs because they're warnings, not errors in PG.
13:29 Dyrcona I wonder if it is failing to update the queued_bib_record  table during the import process.
14:12 jihpringle joined #evergreen
15:05 Dyrcona So, I gave up on the previous issue for a bit, but while watching command output scroll by on a different server, I started wondering if maybe the "did you mean" code is slowing down the creation and updates of bibs to the point where the Vandelay code ends up timing out.
15:05 alynn26 Dyrcona, that had that issue with it in SITKA
15:06 jeffdavis yeah there's a bug for it
15:08 jeffdavis bug 1947173 - there's a fix but it's not in 3.7.2
15:08 pinesol Launchpad bug 1947173 in Evergreen 3.7 "Did You Mean Symspell dictionary updates can significant slow record ingest" [High,Fix committed] https://launchpad.net/bugs/1947173
15:08 Dyrcona alynn26++ jeffdavis++
15:09 Dyrcona At one point today, I was thinking there was a bug that I might want to look at.
15:09 Dyrcona I'll throw that patch on our training server and see what happens.
15:13 mmorgan So back here, I asked about white screen issues: http://irc.evergreen-ils.org/​evergreen/2021-11-18#i_495687
15:14 mmorgan We've had scattered reports of the same thing from many of our libraries. Some workstations, not all.
15:15 alynn26 we have traced most of our white screen issues down to Hatch and Chrome. We do not see them with just Chrome or hatch and firefox.
15:16 mmorgan We've had some success disabling all plugins and add ons that are not used for Evergreen, but we're still not sure what the exact issue is. This morning I observed the phenomenon on my own workstation.
15:17 mmorgan It does seem to have something to do with Hatch. My Chrome has the Hatch Chrome plugin installed, but Hatch is not enabled in Evergreen.
15:18 mmorgan This doc shows screenshots of what the screens look like, and the console: https://docs.google.com/document/d/1d5dC1LhgZgPb​XeFDWcmbCjRCotFl3fZSC5-GlJVyBic/edit?usp=sharing
15:18 jeff do you have steps that you can reliably reproduce?
15:18 mmorgan When we see the whilte screen, this is the last line in the console:
15:18 mmorgan sending to Hatch: {"key":"eg.workstation.all"​,"action":"get","msgid":1}
15:20 mmorgan I didn't want to destroy my test environment, but I *think* this is happening only on angularjs screens.
15:20 mmorgan Any thoughts?
15:20 jeff are there any clues in hatch.log?
15:20 alynn26 That is what we are seeing.  We are also seeing it with some of the Angular screens. but not much.
15:22 mmorgan dumb question time: Where will I find hatch.log?
15:22 csharp_ mmorgan: you can also look at the "background page" for the Hatch extension
15:22 jeff on Windows, it will likely be in %userprofile%\.evergreen
15:23 jeff on masOS/Linux, likely in ~/.evergreen
15:23 jeff you can usually enter %userprofile% in Start -> Run
15:23 jeff unless you've changed the location in logging.properties
15:24 jeff or unless your file permissions on logging.properties prevent it from being read, in which case I forget what the default is. might be "no log"
15:26 berick cd 'C:\Program Files(x86)\Hatch' ; \.hatch.bat test
15:27 berick can also be useful
15:27 mmorgan Hmm. Not seeing a .evergreen directory for my userprofile
15:28 berick correction: .\hatch.bat
15:29 mmorgan csharp_: I see the background page for the Hatch extension, but don't see anything that looks like an error there.
15:32 JBoyer mmorgan, the Hatch logs are in C:\ProgramData\Hatch\hatch.log. ProgramData may be hidden since it's not for touching.
15:33 mmorgan JBoyer: Ok, thanks! hatch.bat also just told me that :)
15:34 jeff ah. THAT must be the default when the native message host can't read logging.config. :-)
15:35 JBoyer That's the default period unless you change it, since "you" don't have any hatch settings, only the system: https://git.evergreen-ils.org/?p=Hatch.git;a=co​mmit;h=0c0265fa967f7038818639a92b8d099401653a18
15:36 jeff ah. lies, damned lies, and comments in config files.
15:39 mmorgan Went to Checkin, see the white screen. hatch.log shows:
15:39 mmorgan 2021-12-07 15:35:31 FINEST MessageWriter sending: {"clientid":155,"msgid":1,​"message":"OK","content":
15:39 mmorgan followed by all the workstations I have registered (of which there are a lot)
15:40 mmorgan "status":200}
15:41 jeff and no corresponding "Hatch responded to message ID 1" in the browser console?
15:42 mmorgan When I refresh the screen, it loads. I refresh my view of the log, and see exactly the same entry after the screen successfully loads.
15:43 mmorgan jeff: I think that is true, but will confirm.
15:43 jeff what version of Evergreen and Hatch are you running?
15:44 jeff guessing Hatch 0.3.2, but not wanting to assume.
15:45 mmorgan jeff: Confirmed: no corresponding "Hatch responded to message ID 1" in the browser console.
15:47 mmorgan I refresh, the screen loads, and I see "Hatch responded to message ID 1" in the browser console.
15:48 mmorgan jeff: Hatch 0.3.2
15:50 mmorgan I logged into the client on 3 different servers this morning, production, test, and dev. I am no longer seeing the problem in production, but am still seeing it in test and dev.
15:51 mmorgan Neglected to say that this was all in the same Chrome window, different tabs, and all showed the problem.
15:53 berick i wonder if it's getting hung up on something else..
15:53 berick mmorgan: go to chrome://inspect/#workers and click on 'inspect' for https://HOSTNAME/js/ui/defaul​t/staff/offline-db-worker.js
15:54 berick see if anything in there complains when the page fails to load
15:57 JBoyer mmorgan, what's "a lot" of workstation names? I wonder if there may be some kind of data size limit for native extension communication.
15:57 mmorgan Hmm. When I go to chrome://inspect/#workers, then navigate to the checkout page that doesn't load, I DO NOT SEE that server listed. I refresh the checkout page, it loads, and now I see that server listed.
15:58 JBoyer or it's that. :)
15:58 jeff mmorgan: what version of Evergreen? did you mention already?
15:58 mmorgan 3 servers, 3 versions :)
15:59 berick might have to refresh the /inspect/ page after opening the checkout tab
16:00 mmorgan 3.6.4, 3.7.2 and another flavor of 3.6, I believe.
16:01 mmorgan berick: the inspect page seems to refresh itself. The listed servers blink on and off depending on whether the screens load or not.
16:01 Dyrcona If I apply that patch for the symspell stuff on a system where it has already been ingested once before, do I have to do it again?
16:02 berick mmorgan: hm, it's inconsistent for me.  just tested again and had to refresh.  *shrug*
16:03 jeff i wonder what you'd see if you go to chrome://extensions, enable Developer mode, and then select the "background page" link for Hatch Native Messenger.
16:04 mmorgan berick: tried refreshing several times when on an incomplete screen. The server doesn't show under Shared workers.
16:05 jeff I get the behavior you've described 100% of the time when I have the hatch browser extension installed but no native messaging host (no Java bits). 100% as in it's not intermittent / fixed with a reload, it just fails every time.
16:06 berick mmorgan: ok, i see.  my next step would be: get to a successfull page load -> open offline inspect -> refresh and hope it breaks -> then see if the offline inspector shows anything interesting
16:06 berick refresh the EG page, i mean, and hope it breaks
16:06 jeff so I wonder if when inspecting the extension you'll have any Hatch disconnected / Reconnecting to hatch / Connecting to native messaging host... type logs.
16:07 Dyrcona Looks like I do....
16:08 jeff I just wonder if (given a not-intentionally-broken environment like I just created) the extension console logs are different when you see / don't see the symptoms.
16:09 jeff I also suspect that this is going to be annoying to us, because I'm fairly certain that we have chrome profiles where Hatch (the extension) is present but the native messaging host is not.
16:09 jeff Hrm.
16:09 mmorgan jeff: I've seen that behavior too, but this is definitely different.
16:10 * jeff nods
16:10 jeff I'm not saying that your issue is caused by the native messaging host not being present -- it wouldn't ever work then. I just wonder if the extension's console log messages give you any clues.
16:10 berick jeff: indeed i have to disable the hatch extenion when i hop on my mac.  we probably need an escape *cough* hatch
16:10 jeff berick: *groan* but yes.
16:11 berick if nothing else, a timeout on the first hatch query -- it should be fast -- before marking in inaccessible
16:12 berick hm, no, it has to be intentional
16:12 jeff don't set isAvailable unless you can ping the native messaging host itself through the extension?
16:13 berick ... or fail very loudly so you know you don't have access to your Hatch if you are accustomed to having access
16:13 berick jeff: yeah pretty much
16:13 mmorgan berick: Ok, got a successful page load on checkout, opened Inspect, navigated to checkin, which broke. Then got a pop-up:
16:14 mmorgan DevTools was disconnected from the page. Once page is reloaded, DevTools will automatically reconnect.
16:15 mmorgan I reload, the screen loads, and the console is now logging again.
16:15 berick mmorgan: ok.  that popup is normal for a reload, so no surprise there.  if it's the last thing you see there, then the worker probably isn't the problem
16:15 berick ugh
16:16 berick i've never seen this type of behavior where Hatch says it's resonding but it's not
16:17 berick which makes me think something else might be gumming up the works before the browser client can check for hatch responses
16:17 jeff mmorgan: are you able to share a screenshot of the extension console log contents when you have the problem / don't have the problem?
16:18 mmorgan jeff: Just a sec.
16:18 * mmorgan has waaay too many windows open now :)
16:18 jeff :-)
16:22 mmorgan jeff: I added the screenshot to the top of https://docs.google.com/document/d/1d5dC1LhgZgPb​XeFDWcmbCjRCotFl3fZSC5-GlJVyBic/edit?usp=sharing
16:23 mmorgan All the entries shown on the screenshot are the result of the successful load. The popup happens when I go to a screen that does not load.
16:23 jeff Hrm. that doesn't look like the console for the extension.
16:24 berick yeah that's shared worker console
16:24 jeff did you get there by going to chrome://extensions, ensuring that Developer Mode was on, and then selecting "background page" for the Hatch extension?
16:24 mmorgan Oh. Right. Too many windows :) Just a sec.
16:30 mmorgan Sorry, called away, just another sec.
16:30 alynn26 I am seeing it constantly with Evergreen 3.7.1 /Chrome Version 95.0.4638.69/ Hatch 0.3.2
16:32 alynn26 This is all I see in the Extension Console. new port connected with id 327
16:32 alynn26 extension.js:113 Removing port 327 on tab disconnect
16:32 alynn26 extension.js:92 Received message from browser on port 327
16:33 mmorgan jeff: Just replaced the image in the doc. The 3 highlighted rows are the result of the unsuccessful load, the subsequent 3 are the successful load.
16:34 jvwoolf left #evergreen
16:35 jeff hah!
16:37 jeff interesting. I wonder if the onDisconnect listener *inside* the onConnect listener is removing our tabid from browserPorts, thus breaking the attempt to postMessage in onNativeMessage above.
16:38 jeff (all in extension.js)
16:38 jeff just a guess based on the ordering of the messages in your console output.
16:44 berick jeff: it sure looks that way.  which raises the question, why is the extension getting a disconnect event..
16:44 berick confirmed my extension logs look more saner..  new -> receive -> remove
16:45 jeff I think it's the disconnect event from the previous tab reload.
16:46 jeff the disconnect event from the tab before-reload arriving / being handled "after" the post-reload connect message -- maybel.
16:46 jeff s/maybel/maybe/
16:47 alynn26 I blame Google Chrome.
16:48 berick enabling timestamps in the console log could provide some clarity
16:48 berick jeff: unable to reproduce, but i see what you are saying.
16:49 berick that would be odd, though, if the browser disconnected a tab after it reconnected the same tab
16:49 mmorgan So I have just "fixed" one of my sessions by clicking around on other screens. Went to Action triggers, opened an event def, and then cancelled. Now my screens are loading fine.
16:50 mmorgan In the extension background page, the order is remove -> new -> received for every page load
16:51 berick mmorgan: i assume you've tried a full "clear site data" on the domain?
16:51 mmorgan alynn26: I'm with you!
16:52 mmorgan berick: No, is that in dev tools?
16:52 berick mmorgan: yeah, dev tools -> Application tab -> Storage -> Clear Site Data button
16:55 mmorgan berick: Cleared site data. All pages are still loading fine.
16:58 * mmorgan thinks this is not going to get solved today.
16:59 mmorgan Hope when I boot up my computer tomorrow, I'll have another opportunity to poke.
16:59 mmorgan jeff++
16:59 mmorgan berick++
17:00 mmorgan alynn26++
17:00 mmorgan JBoyer++
17:05 mmorgan left #evergreen
18:00 pinesol News from qatests: Testing Success <http://testing.evergreen-ils.org/~live>

| Channels | #evergreen index | Today | | Search | Google Search | Plain-Text | summary | Join Webchat